Overview
We have developed and published a suite of fine-tuned generative AI models designed for scalable, evidence-grounded policy monitoring and intelligence. These models are hosted on Hugging Face under the organization vtt-qsts-ai and represent reusable, operational AI components that support human-in-the-loop decision-making in policy analysis.
Key Features
• Policy-native training: Models trained specifically on science, technology, and innovation (STI) policy taxonomies and datasets derived from STIP Compass initiatives, enabling them to understand the structure and semantics of policy instruments rather than general language.
• Traceability of decisions: Outputs are grounded in evidence clusters, which makes the decision process interpretable and easy to verify by policy analysts—critical for public-sector and governance applications.
• Improved reliability: Fine-tuning on cluster-based prompts significantly outperforms zero-shot LLM ensembling (as measured by micro-F1, especially for frequent labels), demonstrating empirical gains in robustness and consistency.
• Human-in-the-loop oriented: Designed to support analysts with calibrated decisions that can be thresholded or ensembled and passed to experts for validation, rather than replacing human judgement.
• Open, reusable building blocks: Models are openly accessible and can be integrated into broader monitoring systems or adapted to domains with similarly structured taxonomies.
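As a minimal sketch of the human-in-the-loop workflow described above, the snippet below shows how per-label probabilities from several models might be averaged and thresholded to produce candidate labels for expert review. The function name, threshold value, and the numbers are illustrative assumptions, not part of the released models.

```python
import numpy as np

def ensemble_and_threshold(prob_matrices, threshold=0.5):
    """Average per-model label probabilities, then apply a decision threshold.

    prob_matrices: list of (n_docs, n_labels) arrays, one per model.
    Returns a boolean (n_docs, n_labels) matrix of candidate labels
    that would then be passed to an analyst for validation.
    """
    mean_probs = np.mean(prob_matrices, axis=0)
    return mean_probs >= threshold

# Two hypothetical models scoring 2 documents against 3 labels.
model_a = np.array([[0.9, 0.2, 0.6],
                    [0.1, 0.8, 0.4]])
model_b = np.array([[0.7, 0.4, 0.3],
                    [0.2, 0.9, 0.6]])

candidates = ensemble_and_threshold([model_a, model_b], threshold=0.5)
# Averaged probabilities: [[0.8, 0.3, 0.45], [0.15, 0.85, 0.5]]
# candidates -> [[True, False, False], [False, True, True]]
```

The threshold can be tuned per label against a validation set, which is one way the "calibrated decisions" mentioned above can be operationalised.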
Models Published on Hugging Face
The following models are currently available in the vtt-qsts-ai organization on Hugging Face:
1. multilabel-indicator-classification-longformer-base
• A long-context classifier based on Longformer, trained for multi-label policy indicator detection.
2. multilabel-indicator-classification-roberta-large
• A large RoBERTa-based multi-label classifier optimized for STI taxonomies.
3. multilabel-indicator-classification-roberta-base
• A smaller RoBERTa variant for efficient multi-label classification.
4. Llama-3.3-70B-Inst-ft-qlora-orig_label-validator
• Instruction-tuned LLaMA 3.3 70B model, fine-tuned with QLoRA for binary policy label validation.
5. Llama-3.1-70B-Inst-qlora-orig_label-validator
• Instruction-tuned LLaMA 3.1 70B variant, QLoRA fine-tuned for cluster-based policy validation.
6. gemma-3-27b-it-ft-stip-orig_label-validator
• Fine-tuned Gemma-3 27B model for policy validation with enhanced instruction following.
7. Qwen2.5-7B-Instruct-ft-stip-orig_label-validator
• Instruction-tuned Qwen2.5 7B model optimized for evidence-cluster-based policy decisions.
8. Mistral-7B-Instruct-v0.3-ft-stip-orig_label-validator
• Instruction-tuned Mistral 7B model with policy validation capabilities.
9. Llama-3.1-8B-Inst-ft-stip-orig_label-validator
• Lightweight LLaMA 3.1 8B model with fine-tuning for policy label validation.
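The three classifiers listed first are multi-label heads, so decoding typically applies an independent sigmoid per label rather than a softmax over labels, which is what allows several policy indicators to fire for the same document. This is an assumption about the standard multi-label setup, not a claim about these checkpoints' exact configuration; the logits below are illustrative stand-ins for model output.

```python
import numpy as np

def sigmoid(x):
    # Each label is scored independently; probabilities need not sum to 1.
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative raw logits for one document over three hypothetical labels.
logits = np.array([2.0, -1.0, 0.5])
probs = sigmoid(logits)            # ~[0.881, 0.269, 0.622]
predicted = (probs >= 0.5).tolist()
# -> [True, False, True]: two labels are assigned simultaneously.
```

With a softmax over the same logits only one label could dominate; the per-label sigmoid is what makes the multi-label formulation work.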
Impact and Reusability
These models enable operational AI-assisted systems for policy monitoring, allowing governing bodies and research infrastructures to deploy scalable solutions rapidly. Because they are built on labelled taxonomies, the approach is transferable to other domains—such as environmental regulation, healthcare policy, or social policy—where structured taxonomies and annotated evidence sets exist. The open release also provides transparent benchmarks for future methodology comparison and model improvement.
Access
All models are openly accessible at:
https://huggingface.co/vtt-qsts-ai