Adding ID Assignment to rustsec-admin

Introduction

rustsec-admin is an administrative utility for maintaining the RustSec advisory database, a collection of security advisories for the Rust ecosystem maintained by the Rust Secure Code Working Group. The project wanted a command that would automatically assign IDs to new vulnerability advisory files, so I began work on a PR to implement it.

Implementation

I started by copying the skeleton of the lint command, since I wasn't yet familiar with the conventions of the rustsec-admin tool. That meant adding a new variant to the AdminCmd enum. Specifically, I added the last variant below:


    /// `rustsec-admin` CLI subcommands
    #[derive(Command, Debug, Options, Runnable)]
    pub enum AdminCmd {
        /// The `lint` subcommand
        #[options(help = "lint Advisory DB and ensure is well-formed")]
        Lint(LintCmd),

        /// The `web` subcommand
        #[options(help = "render advisory Markdown files for the rustsec.org web site")]
        Web(WebCmd),

        /// The `help` subcommand
        #[options(help = "get usage information")]
        Help(Help<Self>),

        /// The `version` subcommand
        #[options(help = "display version information")]
        Version(VersionCmd),

        /// The `assign-id` subcommand
        #[options(help = "assigning RUSTSEC ids to new vulnerabilities")]
        AssignId(AssignIdCmd), // Added this line
    }

Of course, the AssignIdCmd referenced by that variant had to be defined elsewhere in the code base. To do that, I more or less copied the approach of LintCmd: I defined the AssignIdCmd struct itself and implemented the Runnable trait for it.

Specifically, the AssignIdCmd struct has a path field that records where the vulnerability advisory repository lives on the user's system. The Runnable trait, meanwhile, requires a run function, which handles two use cases: rustsec-admin assign-id and rustsec-admin assign-id <insert repository path>. In the first case, the repository path is assumed to be the current directory; in the second, it is whatever path the user supplies. If the user passes more than one path, the expected usage is printed and the program exits.


    /// The implementation of what is described above
    #[derive(Command, Debug, Default, Options)]
    pub struct AssignIdCmd {
        /// Path to the advisory database
        #[options(free, help = "filesystem path to the RustSec advisory DB git repo")]
        path: Vec<PathBuf>,
    }

    impl Runnable for AssignIdCmd {
        fn run(&self) {
            let repo_path = match self.path.len() {
                0 => Path::new("."),
                1 => self.path[0].as_path(),
                _ => Self::print_usage_and_exit(&[]),
            };

            crate::assigner::assign_ids(repo_path);
        }
    }
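
The argument handling in run can be isolated into a small function that is easy to test on its own. The following is only a sketch: resolve_repo_path is a hypothetical helper, not part of rustsec-admin, and it returns None where the real command prints its usage and exits.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of rustsec-admin): pick the repository
/// path from the free arguments, or return `None` for invalid usage.
fn resolve_repo_path(args: &[PathBuf]) -> Option<&Path> {
    match args {
        // `rustsec-admin assign-id`: default to the current directory
        [] => Some(Path::new(".")),
        // `rustsec-admin assign-id <path>`: use the path the user gave
        [path] => Some(path.as_path()),
        // Anything else is a usage error
        _ => None,
    }
}
```

Returning an Option instead of exiting keeps the decision about how to report the error in the caller.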

The assigner module referenced in the run function is where the meat of this implementation lives. Its entry point, assign_ids, loads the repository and sets up an iterator over the vulnerability advisories within it. It then creates a hash map that stores the greatest numerical ID seen for each year. As it iterates over the advisories, it considers only those whose ID is of the RUSTSEC kind: if the advisory's year is not yet in the hash map, it inserts a (year, numerical ID) pair, and otherwise it updates the stored value when the current advisory's numerical ID is greater. After that, the names of the collections within the repository are gathered up, and each one is passed to assign_ids_across_directory along with the path to the repository and a mutable reference to the hash map.


    /// assign ids to advisories in a particular repo_path
    pub fn assign_ids(repo_path: &Path) {
        let repo = rustsec::Repository::open(repo_path).unwrap_or_else(|e| {
            status_err!(
                "couldn't open advisory DB repo from {}: {}",
                repo_path.display(),
                e
            );
            exit(1);
        });

        // Ensure Advisories.toml parses
        let db = rustsec::Database::load(&repo).unwrap();
        let advisories = db.iter();

        // Ensure we're parsing some advisories
        if advisories.len() == 0 {
            status_err!("no advisories found!");
            exit(1);
        }

        status_ok!(
            "Loaded",
            "{} security advisories (from {})",
            advisories.len(),
            repo_path.display()
        );

        let mut highest_id = HashMap::new();

        for advisory in advisories {
            let advisory_clone = advisory.clone();
            let metadata = advisory_clone.metadata;
            let id = metadata.id;
            let year = metadata.date.to_chrono_date().unwrap().year();
            if let Kind::RUSTSEC = id.kind() {
                let id_num = id.numerical_part().unwrap();
                if let Some(&number) = highest_id.get(&year) {
                    if number < id_num {
                        highest_id.insert(year, id_num);
                    }
                } else {
                    highest_id.insert(year, id_num);
                }
            }
        }

        let mut collection_strs = vec![];
        let crates_str = Collection::Crates.to_string();
        let rust_str = Collection::Rust.to_string();
        collection_strs.push(crates_str);
        collection_strs.push(rust_str);
        for collection_str in collection_strs {
            assign_ids_across_directory(collection_str, repo_path, &mut highest_id);
        }
    }
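
The "highest ID per year" bookkeeping can also be written with the standard library's HashMap entry API, which folds the insert and update branches into one expression. This is a minimal sketch rather than the code in the PR: the function names are hypothetical, and plain (year, numerical ID) pairs stand in for the advisory metadata.

```rust
use std::collections::HashMap;

/// Record `id_num` for `year` if it exceeds the highest value seen so far.
fn record_highest(highest: &mut HashMap<i32, u32>, year: i32, id_num: u32) {
    highest
        .entry(year)
        // Year already present: keep whichever ID is larger
        .and_modify(|current| *current = (*current).max(id_num))
        // Year not present yet: start with this ID
        .or_insert(id_num);
}

/// Build the year -> highest ID map from (year, numerical ID) pairs.
fn highest_ids_by_year(ids: &[(i32, u32)]) -> HashMap<i32, u32> {
    let mut highest = HashMap::new();
    for &(year, id_num) in ids {
        record_highest(&mut highest, year, id_num);
    }
    highest
}
```

For example, feeding in the pairs (2019, 3), (2020, 1) and (2019, 12) leaves 2019 mapped to 12 and 2020 mapped to 1.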

assign_ids_across_directory starts by building the path to the relevant collection of vulnerabilities. There are two collections: rust (vulnerabilities in the Rust language and standard library) and crates (vulnerabilities in crates in the Rust ecosystem). After that, it iterates through the directory entries within the collection. Each of these entries is a directory holding the vulnerability advisory files for a particular crate (in the case of the crates collection) or an element of the Rust language (in the case of the rust collection).


    /// Assign IDs to files with placeholder IDs within the collection named by collection_str
    fn assign_ids_across_directory(
        collection_str: String,
        repo_path: &Path,
        highest_ids: &mut HashMap<i32, u32>,
    ) {
        let dir_path = repo_path.join(collection_str);
        if let Ok(collection_entry) = fs::read_dir(dir_path) {
            for dir_entry in collection_entry {
                let unwrapped_dir_entry = dir_entry.unwrap();
                let dir_name = unwrapped_dir_entry.file_name().into_string().unwrap();
                let dir_path = unwrapped_dir_entry.path();
                let dir_path_clone = dir_path.clone();
                for advisory_entry in fs::read_dir(dir_path).unwrap() {

After that, we iterate through the vulnerability files within each directory entry and check whether each one is a placeholder, that is, whether its filename contains RUSTSEC-0000-0000. Files that are not placeholders are ignored. For a placeholder, we copy the file line by line, leaving every line unchanged except the one specifying the ID, which is set to one more than the numerical ID stored for the advisory's year in the year-to-ID hash map. We then update the hash map so that the year of the current advisory maps to the ID that was just written, and delete the old placeholder advisory file.

                for advisory_entry in fs::read_dir(dir_path).unwrap() {
                    let unwrapped_advisory = advisory_entry.unwrap();
                    let advisory_path = unwrapped_advisory.path();
                    let advisory_path_clone = advisory_path.clone();
                    let advisory_path_for_reading = advisory_path.clone();
                    let advisory_path_for_deleting = advisory_path.clone();
                    let displayed_advisory_path = advisory_path.display();
                    let advisory_filename = unwrapped_advisory.file_name();
                    let advisory_filename_str = advisory_filename.into_string().unwrap();
                    if advisory_filename_str.contains("RUSTSEC-0000-0000") {
                        let toml = fs::read_to_string(advisory_path_clone)
                            .map_err(|e| {
                                format_err!(
                                    ErrorKind::Io,
                                    "Couldn't open {}: {}",
                                    displayed_advisory_path,
                                    e
                                );
                            })
                            .unwrap();
                        let advisory = toml.parse::<Advisory>().unwrap();
                        let date = advisory.metadata.date;
                        let year = date.to_chrono_date().unwrap().year();
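                        // Fall back to 0 so the first advisory of a new year gets 0001
                        // instead of panicking on a missing key
                        let new_id = highest_ids.get(&year).copied().unwrap_or(0) + 1;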
                        let year_str = year.to_string();
                        let string_id = format!("RUSTSEC-{}-{:04}", year_str, new_id);
                        let new_filename = format!("{}.toml", string_id);
                        let new_path = dir_path_clone.join(new_filename);
                        let original_file = File::open(advisory_path_for_reading).unwrap();
                        let reader = BufReader::new(original_file);
                        let new_file = File::create(new_path).unwrap();
                        let mut writer = LineWriter::new(new_file);
                        for line in reader.lines() {
                            let current_line = line.unwrap();
                            // Match only the `id` key, not any line that merely contains "id = "
                            if current_line.trim_start().starts_with("id = ") {
                                writer
                                    .write_all(format!("id = \"{}\"\n", string_id).as_ref())
                                    .unwrap();
                            } else {
                                let current_line_with_newline = format!("{}\n", current_line);
                                writer
                                    .write_all(current_line_with_newline.as_ref())
                                    .unwrap();
                            }
                        }
                        highest_ids.insert(year, new_id);
                        fs::remove_file(advisory_path_for_deleting).unwrap();
                        status_ok!("Assignment", "Assigned {} to {}", string_id, dir_name);
                    }
                }
            }
        }
    }
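
The two core steps of the loop above, choosing the next ID and rewriting the id line, can be tried out without touching the filesystem. The sketch below uses hypothetical function names and operates on an in-memory string. It assumes that a year with no assigned IDs yet starts at 0001, and that the id key sits at the start of its own line.

```rust
use std::collections::HashMap;

/// Reserve the next free ID for `year`, e.g. "RUSTSEC-2020-0005".
/// Assumption: a year with no advisories yet starts at 0001.
fn next_id(highest_ids: &mut HashMap<i32, u32>, year: i32) -> String {
    let counter = highest_ids.entry(year).or_insert(0);
    *counter += 1;
    format!("RUSTSEC-{}-{:04}", year, *counter)
}

/// Return `toml` with its `id = ...` line replaced; other lines are kept.
fn rewrite_id_line(toml: &str, new_id: &str) -> String {
    let mut out = String::with_capacity(toml.len());
    for line in toml.lines() {
        if line.trim_start().starts_with("id = ") {
            // Replace the placeholder ID with the newly assigned one
            out.push_str(&format!("id = \"{}\"", new_id));
        } else {
            out.push_str(line);
        }
        out.push('\n');
    }
    out
}
```

Because next_id updates the map as it hands out an ID, calling it again for the same year yields the following number, which mirrors the insert into highest_ids after each placeholder is processed.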