HSC-Bench

Service Recommendation

Task definition, datasets, models, metrics, and results for unified Top-K service recommendation evaluation.

Task definition

Service recommendation takes a user requirement, mashup description, service library, service tags, service descriptions, and optional QoS attributes as input. The output is a Top-K candidate service list. HSC-Bench documents candidate-set scope, train/validation/test split strategy, and recommendation generation logic so that models are compared under the same protocol.

Output format: ranked service IDs with scores, evaluated at configurable K values such as 5 and 10.
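As an illustration of this output format, the following minimal sketch turns per-service scores into a ranked Top-K list. The function name `top_k`, the service IDs, and the tie-breaking rule are assumptions made for the example; HSC-Bench itself may serialize rankings differently.

```python
from typing import Dict, List, Tuple


def top_k(scores: Dict[str, float], k: int) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (service_id, score) pairs, best first."""
    # Sort by descending score; ties are broken by service ID so the
    # output is deterministic (an assumption, not a documented rule).
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical model scores for one mashup requirement.
scores = {"svc_17": 0.91, "svc_03": 0.85, "svc_42": 0.85, "svc_08": 0.40}
ranking = top_k(scores, k=3)
```

The same scores can be truncated at several K values (for example 5 and 10) without re-running the model, since each Top-K list is a prefix of the full ranking.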

Recommendation datasets

| Dataset | Task Support | Domain | Main Fields | Page Notes |
| --- | --- | --- | --- | --- |
| ProgrammableWeb | Service Recommendation / Service Composition | Web API | Mashup, API, invocation relations, tags, descriptions | Classic Web API recommendation and composition dataset; platform availability may limit reproducibility. |
| HSC | Service Recommendation / Service Composition | AI Model Service | AI model services, service workflows, QoS, function tags | Hugging Face based AI service composition dataset. |
| HSC+ | Service Recommendation / Service Composition / QoS | AI Model Service | Function tags, input/output parameters, QoS, requirements, workflows | Core dataset of HSC-Bench for unified service computing evaluation. |
| MovieLens | General Recommendation Baseline | Recommender System | Users, items, ratings | Used to validate generalization of recommendation baselines. |
| Amazon | Cross-domain Recommendation Baseline | E-commerce | Users, products, reviews, interactions | Used for cross-domain recommendation baseline comparison. |

Recommendation model library

Traditional / Statistical

Frequency: Simple frequency baseline for interpretable and reproducible recommendation.
User-Similarity: Collaborative filtering baseline based on similar mashups or users.
Matrix Factorization: Latent factor model for service invocation interactions.

Neural Recommendation

MLP: Neural baseline over text, metadata, or interaction features.
TextCNN: Text encoder baseline for service and requirement descriptions.
T2L2: Representative learning-based service recommendation model.
MTFM: Multi-task factorization model for service recommendation.

Graph-based Recommendation

NGCF: Neural graph collaborative filtering baseline.
HHAN: Heterogeneous hierarchical attention network baseline.
GSAT: Graph structure aware service recommendation baseline.
GSL-Mash: Graph structure learning method for mashup service recommendation.

Pre-trained / Generative

T5: Text-to-text generation baseline for candidate service generation or reranking.
ServiceBERT: Pre-trained service representation model for matching and reranking.
LLM-based reranking: Large language model reranking over retrieved candidate services.

Benchmark Model

SRLCF: Example benchmark model for unified evaluation workflow.
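To make the simplest entry in the library concrete, here is a minimal sketch of a frequency baseline. It assumes that "frequency" means how many training mashups invoke each service and that the same popularity ranking is returned for every requirement; the function names and toy data are hypothetical and do not describe the HSC-Bench implementation.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def fit_frequency(train_mashups: Iterable[Iterable[str]]) -> Counter:
    """Count, for each service, the number of training mashups that use it."""
    counts: Counter = Counter()
    for services in train_mashups:
        # Use a set so a service is counted once per mashup.
        counts.update(set(services))
    return counts


def recommend(counts: Counter, k: int) -> List[Tuple[str, int]]:
    """Return the k most frequently used services, most popular first."""
    # Ties are broken by service ID to keep the ranking deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Toy training split: each inner list is one mashup's invoked services.
train = [["maps", "sms"], ["maps", "pay"], ["maps", "sms", "geo"]]
counts = fit_frequency(train)
top3 = recommend(counts, k=3)
```

Because the ranking ignores the requirement text entirely, a baseline of this kind mainly serves as a floor that learned models should exceed.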

Evaluation metrics

Precision@K ↑

Fraction of Top-K recommended services that match ground-truth services.

Recall@K ↑

Fraction of ground-truth services covered by the Top-K list. This is important when a mashup has multiple target services.

F1@K ↑

Harmonic mean of Precision@K and Recall@K for balanced comparison.
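The three set-based metrics above can be sketched as follows. This is a minimal illustration, assuming binary relevance, a ranked list without duplicates, and Precision@K divided by K even when fewer than K services are returned; the function names are chosen for the example and the benchmark's own evaluation code may use different conventions.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the Top-K slots filled by ground-truth services."""
    if k <= 0:
        return 0.0
    hits = sum(1 for service in ranked[:k] if service in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of ground-truth services that appear in the Top-K list."""
    if not relevant:
        return 0.0
    hits = sum(1 for service in ranked[:k] if service in relevant)
    return hits / len(relevant)


def f1_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Harmonic mean of Precision@K and Recall@K (0.0 when both are zero)."""
    p = precision_at_k(ranked, relevant, k)
    r = recall_at_k(ranked, relevant, k)
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


# Toy example: 2 of the 3 ground-truth services are in the Top-5.
ranked = ["a", "b", "c", "d", "e"]
relevant = {"a", "c", "x"}
p5 = precision_at_k(ranked, relevant, 5)
r5 = recall_at_k(ranked, relevant, 5)
f5 = f1_at_k(ranked, relevant, 5)
```

In this toy case Precision@5 is 2/5 and Recall@5 is 2/3, which shows why the two are reported together when a mashup has several target services.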

NDCG@K ↑

Ranking-sensitive metric that rewards correct services appearing earlier in the list.
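A minimal sketch of NDCG@K with binary relevance is shown below. It assumes the common log2 position discount and an ideal ranking that places all ground-truth services first; the function name is illustrative and graded relevance is not covered.

```python
import math
from typing import Sequence, Set


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """NDCG@K for binary relevance: DCG of the list divided by the ideal DCG."""
    # A hit at 0-based index i (rank i + 1) contributes 1 / log2(rank + 1).
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, service in enumerate(ranked[:k])
        if service in relevant
    )
    # The ideal list puts min(k, |relevant|) hits in the first positions.
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# The same single hit scores higher at rank 1 than at rank 2.
early = ndcg_at_k(["a", "x"], {"a"}, 2)
late = ndcg_at_k(["x", "a"], {"a"}, 2)
```

Unlike Precision@K, the two toy rankings here receive different scores even though both contain the same number of hits.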

MRR ↑

Mean reciprocal rank of the first correct recommendation, emphasizing early hits.
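MRR can be sketched as below. The example assumes one ranked list and one ground-truth set per test mashup, and assigns a reciprocal rank of 0 when no correct service appears in the list; the function names are illustrative only.

```python
from typing import List, Sequence, Set


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first correct service, or 0.0 if there is none."""
    for position, service in enumerate(ranked, start=1):
        if service in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    rankings: List[Sequence[str]], relevants: List[Set[str]]
) -> float:
    """Average reciprocal rank over all (ranking, ground truth) pairs."""
    pairs = list(zip(rankings, relevants))
    if not pairs:
        return 0.0
    return sum(reciprocal_rank(r, g) for r, g in pairs) / len(pairs)


# Three toy queries: first hit at rank 1, at rank 3, and no hit at all.
rankings = [["a", "b"], ["x", "y", "b"], ["q"]]
relevants = [{"a"}, {"b"}, {"z"}]
mrr = mean_reciprocal_rank(rankings, relevants)
```

For these three queries the reciprocal ranks are 1, 1/3, and 0, so the mean is 4/9.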

Recommendation results

The static leaderboard below is backed by a CSV file. Values marked TBD are placeholders and will be replaced once final experiments are available.

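| Model | Dataset | Type | P@5 | P@10 | R@5 | R@10 | NDCG@5 | NDCG@10 | MRR | Code | Official | Unified Split |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SRLCF | HSC+ | Benchmark Model | TBD | TBD | TBD | TBD | TBD | TBD | TBD | Link | Yes | Yes |
| MTFM | HSC+ | Neural | TBD | TBD | TBD | TBD | TBD | TBD | TBD | Link | Planned | Yes |
| GSAT | HSC+ | Graph-based | TBD | TBD | TBD | TBD | TBD | TBD | TBD | Link | Planned | Yes |
| GSL-Mash | ProgrammableWeb | Graph-based | TBD | TBD | TBD | TBD | TBD | TBD | TBD | Link | Planned | TBD |
| Frequency | HSC+ | Traditional | TBD | TBD | TBD | TBD | TBD | TBD | TBD | Link | Yes | Yes |
| LLM-based reranking | HSC+ | LLM-based | TBD | TBD | TBD | TBD | TBD | TBD | TBD | Link | Planned | Yes |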