Ensuring necessary vignettes link to each other properly #7726

Open
HunterB433 wants to merge 9 commits into Rdatatable:master from HunterB433:master

@HunterB433

Closes #5833

As discussed in the chain, there isn't a clear way to link ALL the vignettes, but we can at least check that the main ones are linked together.

Our team started by looking at which vignettes were linked together and found a cluster of six:

  1. Introduction to data.table
  2. Reference semantics
  3. Keys and a fast binary-based subset
  4. Secondary indices and auto indexing
  5. Joins in data.table
  6. Using .SD for Data analysis

We mapped out how they all connect, as shown in the diagram below.

image

The connections were mostly satisfactory, but we felt two things could be improved:

  1. In "Reference semantics," one hyperlink lacks the corresponding R expression, which a reader following the vignette from within R could run instead of clicking the link.
Screenshot 2026-04-24 113624
  2. In "Joins in data.table," there is no link back to "Secondary indices," even though that is where the reader arrives from in the current reading flow.
Screenshot 2026-04-24 115032

To fix both issues, we followed the existing link format in each file and in each language, then checked the HTML files produced via `litedown::fuse("filename.Rmd")` for correctness.
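
The manual check described above could also be scripted. Below is a minimal sketch in base R, not something that is part of this PR. `has_vignette_link()` is a hypothetical helper, the HTML fragment is made up for illustration, and the sketch assumes the renderer may escape double quotes as `&quot;`.

```r
# Hypothetical helper: does the rendered HTML link to `target` and also
# show the matching vignette() call for readers working from an R session?
has_vignette_link <- function(html, target) {
  # Assumption: quotes in the rendered HTML may be escaped as &quot;
  html <- gsub("&quot;", '"', html, fixed = TRUE)
  href <- sprintf('href="%s.html"', target)
  call <- sprintf('vignette("%s", package="data.table")', target)
  c(link         = any(grepl(href, html, fixed = TRUE)),
    r_expression = any(grepl(call, html, fixed = TRUE)))
}

# Made-up fragment standing in for readLines() on a rendered vignette.
html <- c(
  '<p>See the <a href="datatable-sd-usage.html">SD Usage vignette',
  '(<code>vignette(&quot;datatable-sd-usage&quot;, package=&quot;data.table&quot;)</code>)</a>.</p>'
)

res <- has_vignette_link(html, "datatable-sd-usage")
print(res)
```

In practice `html` would come from `readLines()` on each rendered file, once per language.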

Here is what they looked like for problem 1:
EN)
image
ES)
Screenshot 2026-04-24 131039
FR)
Screenshot 2026-04-24 130853

Here is what they looked like for problem 2:
EN)
Screenshot 2026-04-24 115045
ES)
Screenshot 2026-04-24 130952
FR)
Screenshot 2026-04-24 130723

We noticed that for problem 2 the R expression had also been translated, but only in one instance. We kept the translation where it already existed and left the other instance untranslated.

Overall, the updated diagram is nearly the same, now with one extra connection from "Joins" to "Secondary indices":
image

We are unsure whether this will trigger any tests, since it's just vignettes, but whatever happens we will get it fully working before the final PR.

Thank you for your time.

@HunterB433 HunterB433 requested review from a team and MichaelChirico as code owners April 24, 2026 21:06
Member

@aitap aitap left a comment

Thank you for the investigation!

Here's the vignette link graph:

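graph = list.files(pattern = '[.]Rmd$') |> setNames(nm = _)  |> lapply(\(x) {
  x = readLines(x)
  # capture same-directory link targets such as (datatable-intro.html)
  x = regmatches(x, gregexec('\\(([^/)]*[.]html)\\)', x))
  # keep lines with a match, take the first captured file name on each, de-duplicate
  x[lengths(x)>0] |> lapply(`[`, 2) |> unlist() |> unique()
})
graph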
$`datatable-benchmarking.Rmd`
NULL

$`datatable-faq.Rmd`
NULL

$`datatable-fread-and-fwrite.Rmd`
NULL

$`datatable-importing.Rmd`
NULL

$`datatable-intro.Rmd`
[1] "datatable-keys-fast-subset.html"    "datatable-reference-semantics.html"

$`datatable-joins.Rmd`
[1] "datatable-intro.html"
[2] "datatable-reference-semantics.html"
[3] "datatable-keys-fast-subset.html"
[4] "datatable-secondary-indices-and-auto-indexing.html"

$`datatable-keys-fast-subset.Rmd`
[1] "datatable-intro.html"
[2] "datatable-reference-semantics.html"
[3] "datatable-secondary-indices-and-auto-indexing.html"

$`datatable-programming.Rmd`
NULL

$`datatable-reference-semantics.Rmd`
[1] "datatable-intro.html"            "datatable-sd-usage.html"
[3] "datatable-keys-fast-subset.html"

$`datatable-reshape.Rmd`
NULL

$`datatable-sd-usage.Rmd`
[1] "datatable-reference-semantics.html"

$`datatable-secondary-indices-and-auto-indexing.Rmd`
[1] "datatable-intro.html"               "datatable-reference-semantics.html"
[3] "datatable-keys-fast-subset.html"    "datatable-joins.html"
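
To see which of these links are not reciprocated, the graph can be flattened into an edge list. This is only a sketch: the list below is typed in by hand from the output above, with the `datatable-` prefix and file extensions dropped for brevity and the vignettes without links omitted.

```r
# Link graph copied by hand from the output above (shortened names).
graph <- list(
  intro = c("keys-fast-subset", "reference-semantics"),
  joins = c("intro", "reference-semantics", "keys-fast-subset",
            "secondary-indices-and-auto-indexing"),
  `keys-fast-subset` = c("intro", "reference-semantics",
                         "secondary-indices-and-auto-indexing"),
  `reference-semantics` = c("intro", "sd-usage", "keys-fast-subset"),
  `sd-usage` = "reference-semantics",
  `secondary-indices-and-auto-indexing` = c("intro", "reference-semantics",
                                            "keys-fast-subset", "joins")
)

# Flatten to an edge list: one row per "from links to to".
edges <- data.frame(
  from = rep(names(graph), lengths(graph)),
  to   = unlist(graph, use.names = FALSE)
)

# An edge is one-way when the reverse edge is absent from the list.
has_reverse <- paste(edges$to, edges$from) %in% paste(edges$from, edges$to)
one_way <- edges[!has_reverse, ]
print(one_way, row.names = FALSE)
```

A one-way link is not necessarily a problem; this is just a quick way to list them for review.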

And the following vignettes are never linked to:

list.files(pattern = '[.]Rmd$') |> sub('[.]Rmd$', '.html', x = _) |> setdiff(unlist(graph))
[1] "datatable-benchmarking.html"     "datatable-faq.html"
[3] "datatable-fread-and-fwrite.html" "datatable-importing.html"
[5] "datatable-programming.html"      "datatable-reshape.html"

Maybe they should be made part of the reading chain as well? @Rdatatable/committers I suggest the following order:

datatable-reshape.html
datatable-fread-and-fwrite.html
datatable-faq.html
datatable-importing.html
datatable-programming.html
datatable-benchmarking.html

FAQ at the end of the user topics, two vignettes for people who program with data.table, one vignette about the development process.
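
For illustration, the previous/next neighbours implied by that order can be derived mechanically. This is a sketch under assumptions: `vignette_link()` is a hypothetical helper, the link text is modelled on the format used elsewhere in this PR, and how this chain would attach to the existing one is left open.

```r
# Order proposed in the comment above.
chain <- c("datatable-reshape", "datatable-fread-and-fwrite", "datatable-faq",
           "datatable-importing", "datatable-programming",
           "datatable-benchmarking")

# Hypothetical helper: a Markdown link whose text is the vignette() call.
vignette_link <- function(name) {
  sprintf('[`vignette("%s", package="data.table")`](%s.html)', name, name)
}

# Shift the chain by one position in each direction; the ends get NA.
n <- length(chain)
nav <- data.frame(
  vignette  = chain,
  previous  = c(NA, vignette_link(chain[-n])),
  following = c(vignette_link(chain[-1]), NA)
)
print(nav[, c("vignette", "following")], row.names = FALSE)
```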

flights[, names(.SD) := lapply(.SD, as.factor), .SDcols = is.character]
```
-Let's clean up again and convert our newly-made factor columns back into character columns. This time we will make use of `.SDcols` accepting a function to decide which columns to include. In this case, `is.factor()` will return the columns which are factors. For more on the **S**ubset of the **D**ata, there is also an [SD Usage vignette](https://cran.r-project.org/package=data.table/vignettes/datatable-sd-usage.html).
+Let's clean up again and convert our newly-made factor columns back into character columns. This time we will make use of `.SDcols` accepting a function to decide which columns to include. In this case, `is.factor()` will return the columns which are factors. For more on the **S**ubset of the **D**ata, there is also an [SD Usage vignette ('vignette("datatable-sd-usage", package="data.table")')](datatable-sd-usage.html).
Member

Suggested change
-Let's clean up again and convert our newly-made factor columns back into character columns. This time we will make use of `.SDcols` accepting a function to decide which columns to include. In this case, `is.factor()` will return the columns which are factors. For more on the **S**ubset of the **D**ata, there is also an [SD Usage vignette ('vignette("datatable-sd-usage", package="data.table")')](datatable-sd-usage.html).
+Let's clean up again and convert our newly-made factor columns back into character columns. This time we will make use of `.SDcols` accepting a function to decide which columns to include. In this case, `is.factor()` will return the columns which are factors. For more on the **S**ubset of the **D**ata, there is also an [SD Usage vignette (`vignette("datatable-sd-usage", package="data.table")`)](datatable-sd-usage.html).

Use backticks `, not apostrophes ', to format inline code.

Development

Successfully merging this pull request may close these issues.

Hyperlinks to navigate between Vignettes?

3 participants