Intent stubs for majority of safety intents in trait typology generated by prompting Qwen-235B #1856
Draft
aishwaryap wants to merge 1 commit into
…ed by prompting Qwen-235B Signed-off-by: Aishwarya Padmakumar <apadmakumar@nvidia.com>
5cc057e to
2df311f
aishwaryap
commented
Jun 11, 2026
aishwaryap
left a comment
Collaborator
Author
Meeting feedback
- We want a small number of highly curated stubs rather than a large number of SDG stubs.
- We would like to know whether current models are reasonably likely to respond to (not refuse) these stubs.
- How many stubs do we need? 20-30 per sub-intent? A minimum sample of 5?
- Add a provenance.md in data/cas/provenance and reference it from the README.md. It should reference the stub filenames and include licensing info.
- Maybe add a test that checks that new stub files have provenance.
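The last feedback item could be sketched roughly as follows. This is only an illustration under stated assumptions: the provenance file location comes from the feedback above, while `STUB_DIR`, the helper name `stubs_missing_provenance`, and the rule "a stub file is covered if its filename appears anywhere in provenance.md" are hypothetical choices, not something this PR implements.

```python
from pathlib import Path

# From the meeting feedback: provenance.md lives in data/cas/provenance.
PROVENANCE_FILE = Path("data/cas/provenance/provenance.md")
# Hypothetical: the real stub directory in the repo may differ.
STUB_DIR = Path("data/cas/stubs")


def stubs_missing_provenance(stub_dir: Path, provenance_file: Path) -> list[str]:
    """Return sorted names of stub files that the provenance file never mentions."""
    # A missing provenance file means nothing is covered.
    text = provenance_file.read_text(encoding="utf-8") if provenance_file.exists() else ""
    # Simple substring match on the filename; assumes stub filenames are unique.
    return sorted(p.name for p in stub_dir.rglob("*.json") if p.name not in text)


def test_all_stub_files_have_provenance():
    # Collected by pytest; fails with the list of uncovered stub files.
    missing = stubs_missing_provenance(STUB_DIR, PROVENANCE_FILE)
    assert not missing, f"Stub files without a provenance entry: {missing}"
```

Substring matching is deliberately loose; if provenance.md ends up with a fixed structure (for example one table row per file), the check could parse that structure instead.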
This PR submits intent stubs generated for the majority of safety intents in the trait typology.
The stubs were generated by prompting Qwen-235B, and samples from each intent were manually inspected for suitability. Not all stubs were inspected, so some may be suboptimal.
This PR only adds intent stub files and does not use them anywhere yet. The goal is to get us closer to being able to add a technique and run it on a wide range of intents.
Notes for review / Goals of this PR
Tests pass, but they only check that stub files follow the expected format for single-turn JSON stubs and can be loaded.
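For reviewers unfamiliar with that kind of check, a format-and-load test might look roughly like the sketch below. The actual stub schema is not shown in this PR description, so the field names in `REQUIRED_KEYS`, the assumption that a stub file is a JSON array of objects, and the helper name `stub_file_problems` are all hypothetical.

```python
import json
from pathlib import Path

# Hypothetical field names; the real single-turn stub schema is defined in the repo.
REQUIRED_KEYS = ("intent", "prompt")


def stub_file_problems(path: Path, required_keys=REQUIRED_KEYS) -> list[str]:
    """Return human-readable problems for one stub file; an empty list means it passed."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path.name}: not valid JSON ({exc.msg})"]
    # Assumption: a stub file is a non-empty JSON array of stub objects.
    if not isinstance(data, list) or not data:
        return [f"{path.name}: expected a non-empty JSON array of stubs"]
    problems = []
    for i, stub in enumerate(data):
        if not isinstance(stub, dict):
            problems.append(f"{path.name}[{i}]: expected an object")
            continue
        for key in required_keys:
            value = stub.get(key)
            # Each required field must be a non-blank string.
            if not isinstance(value, str) or not value.strip():
                problems.append(f"{path.name}[{i}]: missing or empty '{key}'")
    return problems
```

Returning a list of problems rather than raising on the first one lets a single test run report every malformed stub in a file.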
Goals of this PR:
(we should update the stubs README with answers to the following)
SDG Process
These were generated by prompting Qwen-235B as follows:
For most traits, goal was filled in with the default stub associated with the intent. For a few, goal was manually handcrafted based on the description.
Verification
python -m pytest tests/