HSC+ Dataset

Overview

HSC+ is a large-scale AI model service dataset built from Hugging Face model services. It provides unified service metadata, refined functional annotations, QoS measurements, generated user requirements, and executable service workflows for service recommendation and service composition.

Core value

HSC+ is designed as a shared data resource for three connected research problems: service recommendation, service composition, and QoS analysis.

Construction pipeline

Hugging Face Model Collection

Collect publicly accessible Hugging Face model services.

Endpoint Probing

Check endpoint accessibility and record HTTP status and invocation results.
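
A single probe could look like the sketch below. It assumes a plain HTTP GET against the service URL and uses only the Python standard library; the `probe` function, its `opener` parameter, and the record keys are hypothetical names chosen for illustration, not part of the HSC+ tooling.

```python
import urllib.error
import urllib.request


def probe(url, timeout=10.0, opener=urllib.request.urlopen):
    """Probe one endpoint and return a small record of the outcome.

    `opener` is injectable so the function can be exercised without network access.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return {"url": url, "status": resp.status, "accessible": True, "error": None}
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 403 or 503.
        return {"url": url, "status": exc.code, "accessible": False, "error": str(exc.reason)}
    except (urllib.error.URLError, TimeoutError) as exc:
        # No HTTP response at all (DNS failure, refused connection, timeout).
        return {"url": url, "status": None, "accessible": False, "error": str(exc)}
```

Keeping the status code separate from the `accessible` flag lets later steps distinguish "the server refused the request" from "no response arrived".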

Metadata Extraction

Extract service name, author, description, URL, update time, and input/output parameters.

LLM-assisted Annotation

Use large language models to annotate input type, output type, and functional categories.

Human Verification

Resolve ambiguous samples through volunteer and domain-expert validation.

QoS Invocation

Invoke services repeatedly to collect response time, waiting time, reliability, and successability.
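
As a rough illustration, repeated invocations of one service can be aggregated into QoS values as follows. The `Invocation` record and `summarize_qos` are hypothetical names. The sketch assumes that any 2xx status counts as a successful HTTP response and that both ratios are taken over all requests; waiting time is left out for brevity.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Invocation:
    status: Optional[int]   # HTTP status code, or None if no response arrived
    response_time: float    # time from sending the request to receiving a response
    valid_output: bool      # True if the output passed a task-level validity check


def summarize_qos(invocations):
    """Aggregate repeated invocations of one service into QoS values."""
    if not invocations:
        raise ValueError("at least one invocation is required")
    total = len(invocations)
    # Assumption: any 2xx status counts as a successful HTTP response.
    ok = [i for i in invocations if i.status is not None and 200 <= i.status < 300]
    # Assumption: a valid task completion also requires a successful HTTP response.
    valid = [i for i in ok if i.valid_output]
    return {
        "response_time": mean(i.response_time for i in invocations),
        "reliability": len(ok) / total,
        "successability": len(valid) / total,
    }
```

Under these assumptions successability can never exceed reliability, since a response has to arrive before its content can be judged valid.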

Workflow Generation

Generate syntactically valid workflows from input/output transformation constraints.
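
One way to read "input/output transformation constraints" is that each service in a chain must accept the type produced by the previous one. The sketch below enumerates such chains by depth-first search. It illustrates that reading and is not necessarily the HSC+ generator; `generate_workflows`, the `max_length` cap, and the type labels in the usage example are assumptions.

```python
def generate_workflows(services, source_type, target_type, max_length=3):
    """Enumerate service chains that turn `source_type` into `target_type`.

    `services` maps a service name to an (input_type, output_type) pair.
    A chain is valid when each service's input type equals the previous output type.
    """
    workflows = []

    def extend(chain, current_type):
        if chain and current_type == target_type:
            workflows.append(list(chain))  # record a copy; `chain` keeps mutating
            return
        if len(chain) == max_length:
            return
        for name, (in_type, out_type) in services.items():
            if in_type == current_type and name not in chain:
                chain.append(name)
                extend(chain, out_type)
                chain.pop()

    extend([], source_type)
    return workflows


# Hypothetical services, labeled by input and output type.
services = {
    "asr": ("audio", "text"),
    "translate": ("text", "text"),
    "tts": ("text", "audio"),
}
print(generate_workflows(services, "audio", "audio"))
# [['asr', 'translate', 'tts'], ['asr', 'tts']]
```

The search stops extending a chain as soon as it reaches the target type, and each service appears at most once per chain.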

Requirement Generation

Use LLMs to generate natural-language user requirements for verified workflows.

Data schema

Field	Description
`service_name`	Service or model name.
`author`	Service author or model publisher.
`function`	Functional category that the service can satisfy.
`description`	Service description from the provider or enriched annotation.
`url`	Model or service access link.
`input_parameter`	Input data required by the service.
`output_parameter`	Output returned by the service.
`downloads`	Recent monthly download count.
`likes`	Number of user likes.
`response_time`	Time from sending a request to receiving a response.
`waiting_time`	Time spent waiting for the model to initialize or load.
`reliability`	Ratio of successful HTTP responses over all requests.
`successability`	Ratio of invocations that complete the task and return a correct, usable output.
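
For readers who load the data programmatically, the fields above could be mirrored by a typed record such as the one below. The Python types are assumptions, since the table does not state them, and `ServiceRecord` and `record_from_row` are hypothetical names.

```python
from dataclasses import dataclass, fields


@dataclass
class ServiceRecord:
    service_name: str
    author: str
    function: str
    description: str
    url: str
    input_parameter: str
    output_parameter: str
    downloads: int
    likes: int
    response_time: float
    waiting_time: float
    reliability: float
    successability: float

    def __post_init__(self):
        # Both values are described as ratios, so they should lie in [0, 1].
        for name in ("reliability", "successability"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


def record_from_row(row):
    """Build a ServiceRecord from a dict of strings (e.g. a csv.DictReader row)."""
    # Each field's annotation (str, int, float) doubles as its converter.
    converted = {f.name: f.type(row[f.name]) for f in fields(ServiceRecord)}
    return ServiceRecord(**converted)
```

Validating the two ratios at construction time surfaces malformed rows early instead of letting them skew later statistics.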

Statistics to publish

Service categories

Number of models under each Hugging Face task category.

Downloads and likes

Popularity distributions showing long-tail service usage patterns.

Loading / waiting time

Average loading time across model types.

Response time

Average, minimum, and maximum response time by service type.
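
A minimal sketch of this summary, assuming the measurements are available as `(service_type, response_time)` pairs; `response_time_stats` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean


def response_time_stats(records):
    """Group (service_type, response_time) pairs and summarize each group."""
    by_type = defaultdict(list)
    for service_type, response_time in records:
        by_type[service_type].append(response_time)
    return {
        service_type: {"avg": mean(times), "min": min(times), "max": max(times)}
        for service_type, times in by_type.items()
    }
```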

Status code distribution

Ratios of HTTP 200, 400, 403, 500, 503, and other response status codes.
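
The ratios can be computed from a flat list of observed status codes, as in this sketch. The bucketing of every unlisted code into `"other"` follows the description above; `status_ratios` and `REPORTED` are hypothetical names.

```python
from collections import Counter

# Status codes reported individually; everything else is grouped as "other".
REPORTED = (200, 400, 403, 500, 503)


def status_ratios(status_codes):
    """Return the share of each reported status code among all responses."""
    counts = Counter(code if code in REPORTED else "other" for code in status_codes)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Only codes that were actually observed appear in the result.
    return {key: counts[key] / total for key in (*REPORTED, "other") if counts[key]}
```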

Reliability and successability

Distribution of reliability (successful HTTP responses) and successability (valid task completions) across services.

Workflow length

Complexity distribution of generated service workflows.

Requirement length and type

Diversity of natural-language user requirements generated for workflows.

Task support

Recommendation

Requirement-to-service and mashup-to-service matching with Top-K evaluation.
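
The text does not name the Top-K metrics used. As an assumption, the sketch below shows two common choices, Precision@K and Recall@K, computed for a single requirement; `precision_recall_at_k` is a hypothetical helper name.

```python
def precision_recall_at_k(ranked, relevant, k):
    """Compute Precision@K and Recall@K for one ranked recommendation list.

    `ranked` is the recommended service list, best first.
    `relevant` is the set of ground-truth services for the requirement.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for service in ranked[:k] if service in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, with `ranked = ["a", "b", "c", "d"]`, `relevant = {"b", "d", "e"}`, and `k = 2`, one of the top two recommendations is relevant, so Precision@2 is 1/2 and Recall@2 is 1/3.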

Composition

Input/output compatible workflow generation and QoS-aware optimization.
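
One simple form of QoS-aware selection is sketched below: among candidate workflows, pick the fastest one that meets a reliability floor. It assumes sequential execution, so response times add up and reliabilities multiply; both aggregation rules, the `best_workflow` name, and the `min_reliability` parameter are illustrative assumptions rather than the benchmark's defined objective.

```python
from math import prod


def best_workflow(workflows, qos, min_reliability=0.0):
    """Pick the workflow with the lowest total response time above a reliability floor.

    `workflows` is a list of service-name lists.
    `qos` maps a service name to {"response_time": ..., "reliability": ...}.
    Returns None when no workflow satisfies the floor.
    """
    best, best_time = None, float("inf")
    for workflow in workflows:
        # Assumption: services run one after another, so times add up
        # and the chain succeeds only if every step succeeds.
        total_time = sum(qos[s]["response_time"] for s in workflow)
        reliability = prod(qos[s]["reliability"] for s in workflow)
        if reliability >= min_reliability and total_time < best_time:
            best, best_time = workflow, total_time
    return best
```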

QoS Analysis

Response time, waiting time, reliability, successability, and status code analysis.

Download and citation

The public dataset link, version number, checksum, and license should be filled in before release.

Dataset Link Reproduction Guide

@inproceedings{hscbench2026,
  title     = {HSC-Bench: A Comprehensive Benchmark for Unified Service Recommendation and Composition Evaluation},
  author    = {TBD},
  booktitle = {TBD},
  year      = {2026}
}