HSC-Bench

Download & Reproduction

Data links, code repositories, environment setup, evaluation scripts, and minimal commands for reproducing HSC-Bench results.

Data downloads

Dataset | Version | Tasks | Files | Download
ProgrammableWeb | TBD | Service Recommendation / Service Composition | Mashup, API, invocation relations, tags, descriptions | Link
QWS | TBD | Service Composition / QoS Optimization | Response time, throughput, availability, reliability, cost-like QoS attributes | Link
WS-Dream | TBD | QoS Prediction / Service Recommendation / Composition | Response time, throughput, user-service invocation matrix | Link
HSC | TBD | Service Recommendation / Service Composition | AI model services, service workflows, QoS, function tags | Link
HSC+ | v1.0 draft | Service Recommendation / Service Composition / QoS | Function tags, input/output parameters, QoS, requirements, workflows | Link
MovieLens | TBD | General Recommendation Baseline | Users, items, ratings | Link
Amazon | TBD | Cross-domain Recommendation Baseline | Users, products, reviews, interactions | Link
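
Before preprocessing, verify each downloaded archive against its published checksum. The Python sketch below shows the idea; the archive name and digest in the manifest are placeholders, since the real checksums ship with each data release.

import hashlib
from pathlib import Path

# Hypothetical manifest mapping dataset archives to expected SHA-256 digests;
# real digests are published with each HSC-Bench data release.
EXPECTED_SHA256 = {
    "hsc_plus_v1.tar.gz": "<digest from the release page>",
}

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    # Stream in chunks so large archives need not fit in memory.
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

for name, expected in EXPECTED_SHA256.items():
    actual = sha256_of(Path("data") / name)
    print(f"{name}: {'OK' if actual == expected else 'MISMATCH'} ({actual})")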

Code repositories

Benchmark Core

Data loaders, split definitions, evaluation scripts, and submission templates.

Repository
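
To illustrate how the official split definitions are meant to be consumed, here is a minimal sketch; the split-file layout and the load_split helper are assumptions for illustration, not the repository's confirmed API.

import json
from pathlib import Path

# Assumed layout: split definitions stored as JSON lists of example IDs,
# e.g. data/hsc_plus/v1/train_ids.json. Not a confirmed API.
def load_split(root: Path, dataset: str, version: str) -> dict[str, list[str]]:
    splits = {}
    for part in ("train", "validation", "test"):
        with (root / dataset / version / f"{part}_ids.json").open() as f:
            splits[part] = json.load(f)
    return splits

splits = load_split(Path("data"), "hsc_plus", "v1")
# The official partitions must not overlap.
assert not set(splits["train"]) & set(splits["test"])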

Recommendation Baselines

Implementations of traditional, neural, graph-based, and LLM-reranking recommendation baselines.

Repository

Composition Baselines

Implementations of optimization-based, learning-based, and agentic workflow-generation baselines.

Repository
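
To make the composition task concrete, the sketch below aggregates the QoS attributes from the data table over a sequential workflow, using the standard conventions: response times add, availability and reliability multiply, and throughput is bounded by the slowest service. The service values are invented.

from dataclasses import dataclass

@dataclass
class ServiceQoS:
    response_time: float  # seconds, lower is better
    throughput: float     # requests/second, higher is better
    availability: float   # probability in [0, 1]
    reliability: float    # probability in [0, 1]

def aggregate_sequential(services: list[ServiceQoS]) -> ServiceQoS:
    # Standard QoS aggregation for a sequential composition.
    agg = ServiceQoS(0.0, float("inf"), 1.0, 1.0)
    for s in services:
        agg.response_time += s.response_time
        agg.throughput = min(agg.throughput, s.throughput)
        agg.availability *= s.availability
        agg.reliability *= s.reliability
    return agg

# Invented example values for a three-service workflow.
workflow = [ServiceQoS(0.12, 90.0, 0.999, 0.99),
            ServiceQoS(0.30, 40.0, 0.995, 0.98),
            ServiceQoS(0.08, 120.0, 0.999, 0.99)]
print(aggregate_sequential(workflow))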

Environment

Recommended environment and release metadata:

  • Python: 3.10+
  • PyTorch / CUDA: fill in per model release
  • Evaluation scripts: versioned with the benchmark repository
  • Random seed and hardware: reported in every result card
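
A sketch of the result card mentioned above; the field names, schema, and output path are illustrative only, not a fixed format.

import json
import platform
from pathlib import Path

import torch  # per the environment list above

# Illustrative result-card fields; the benchmark may define its own schema.
result_card = {
    "model": "SRLCF",                 # name taken from the quick-start config
    "dataset": "hsc_plus",
    "dataset_version": "v1",
    "seed": 42,                       # placeholder
    "python": platform.python_version(),
    "pytorch": torch.__version__,
    "cuda": torch.version.cuda,       # None on CPU-only builds
    "hardware": platform.node(),      # replace with GPU/CPU model in practice
}

Path("outputs").mkdir(exist_ok=True)
with open("outputs/result_card.json", "w") as f:
    json.dump(result_card, f, indent=2)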

Quick start

# 1. Prepare datasets
python scripts/prepare_data.py --dataset hsc_plus --version v1

# 2. Run service recommendation
python run_recommendation.py --config configs/recommendation/srlcf_hsc_plus.yaml

# 3. Run service composition
python run_composition.py --config configs/composition/gnnpn_sc_hsc_plus.yaml

# 4. Evaluate and export leaderboard rows
python scripts/evaluate.py --task recommendation --pred outputs/recommendation.jsonl
python scripts/evaluate.py --task composition --pred outputs/composition.jsonl
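
The evaluation scripts consume JSONL predictions. Their exact schema is defined by the benchmark's submission templates; the fields below are an assumed minimal shape for illustration only.

import json
from pathlib import Path

# Assumed minimal shape of outputs/recommendation.jsonl: one JSON object per
# test query with an ordered candidate list. Field names are illustrative;
# consult the benchmark's submission templates for the real schema.
rows = [
    {"query_id": "q-0001", "ranked_services": ["s-17", "s-03", "s-42"]},
    {"query_id": "q-0002", "ranked_services": ["s-08", "s-42", "s-11"]},
]

Path("outputs").mkdir(exist_ok=True)
with open("outputs/recommendation.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")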

Reproduction checklist

Dataset version

Record the exact dataset version, file checksum, and preprocessing-script commit.
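
One way to capture all three items in a single record, sketched below; the data path is a placeholder, and for large files a streaming hash (as in the download-verification sketch above) is preferable.

import hashlib
import subprocess
from pathlib import Path

data_file = Path("data/hsc_plus/v1/services.jsonl")  # placeholder path

record = {
    "dataset_version": "v1",
    # Whole-file read is fine for a sketch; stream large files instead.
    "sha256": hashlib.sha256(data_file.read_bytes()).hexdigest(),
    # Commit of the preprocessing script actually used for this run.
    "preprocess_commit": subprocess.check_output(
        ["git", "rev-parse", "HEAD"], text=True).strip(),
}
print(record)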

Unified split

Use the official train/validation/test split for fair comparison.

Configuration

Publish the YAML config, random seed, model hyperparameters, and hardware details.
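
A minimal sketch of loading a published config and applying its seed; the config keys are assumptions, with only the file name taken from the quick start.

import random

import numpy as np
import torch
import yaml  # PyYAML

# Assumed config keys; the real schema lives in configs/ in the repository.
with open("configs/recommendation/srlcf_hsc_plus.yaml") as f:
    cfg = yaml.safe_load(f)

seed = cfg.get("seed", 42)
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)  # also seeds all CUDA devices on current PyTorch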

Logs and outputs

Attach raw predictions, evaluation logs, and the generated leaderboard row.
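
A sketch of assembling the leaderboard row from evaluation output; the metrics-file location and field names are assumptions.

import json

# Assumption: scripts/evaluate.py writes a metrics JSON alongside its logs.
with open("outputs/recommendation_metrics.json") as f:
    metrics = json.load(f)

leaderboard_row = {
    "model": "SRLCF",            # from the quick-start config name
    "dataset": "hsc_plus/v1",
    **metrics,                   # whatever the evaluator reports
    "predictions": "outputs/recommendation.jsonl",
}
print(json.dumps(leaderboard_row))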