Export: enhance trimming using build statement metadata#3530
Conversation
setting subincludes at package level instead of at target level
- register subinclude statements in the package metadata - filter subincludes label - export all non build_target related statements
…dary build def with sources the updated test uses non-standard child naming to validate new trimming logic
| SubrepoName: subrepo, | ||
| targets: map[string]*BuildTarget{}, | ||
| Outputs: map[string]*BuildTarget{}, | ||
| BuildFileMetadata: newNoopPackageMetadata(), |
There was a problem hiding this comment.
Why do we always use Noop here?
There was a problem hiding this comment.
We default to noop and override if the metadata option is set. Please take a look at the unified NewPackage. Does it make sense?
| // RegisterStatement maps a build statement to target in the package. | ||
| func (pkg *Package) RegisterStatement(target *BuildTarget, stmtProvider BuildStatementProvider) { | ||
| pkg.mutex.Lock() | ||
| defer pkg.mutex.Unlock() |
There was a problem hiding this comment.
Why are you only protecting writes with the mutex, and not reads? Go will panic if there is a concurrent read and write.
tbh, I'd expect any locking to be done inside the BuildFileMetadata implementation (which would avoid paying the cost of locking when we're using the Noop implementation). Better still would be to use concurrency-safe datastructures wherever possible (which most likely use locks under the hood anyway, but might not)
There was a problem hiding this comment.
I was going to justify the use of *Package level methods for reusing the mutex. I initially enriched the AddTarget logic with a BuildStatement, reusing that lock but eventually separated into different methods.
Read is currently only done synchronously, since the export doesn't (yet) support multi threading. I fully agreed and will migrate BuildFileMetadata to concurrent-safe maps. Thanks for raising this.
There was a problem hiding this comment.
Updated. I've used RWMutex instead of dedicated data structures. Mostly because of simplicity and avoiding the performance overhead but also to follow the example of Package and BuildTarget. I can be persuaded in the other direction but I think the parsing is sparse and the operation quick enough that we won't be waiting on the locks that often.
| state *core.BuildState | ||
| targetDir string | ||
|
|
||
| exportedTargets map[*core.Package]map[core.BuildLabel]bool |
There was a problem hiding this comment.
Using a pointer as a key always makes me uncomfortable; can we avoid this?
More generally, could we avoid the nested map? core.BuildLabel includes the package name anyway, right?
There was a problem hiding this comment.
We can probably avoid using the pointer has key by using the package label, but we will have to look up in the graph each time. I was using the pointer directly assuming some consistency of no repeated package instances. Should I use the string instead and lookup in the graph each time we want to use it?
The nested map is useful for looping though the exported target per each package (and for efficient verification of visited targets). What's your opinion, should I try to unnest?
| targetPath := filepath.Join(be.targetDir, file) | ||
| if err := os.RemoveAll(targetPath); err != nil { | ||
| log.Fatalf("failed to remove .plzconfig file %s: %v", file, err) | ||
| } | ||
| if err := fs.CopyFile(file, path, 0); err != nil { | ||
| if err := fs.CopyFile(file, targetPath, 0); err != nil { | ||
| log.Fatalf("failed to copy .plzconfig file %s: %v", file, err) | ||
| } |
There was a problem hiding this comment.
Should we be using fs.RecursiveCopy like we do most places below?
There was a problem hiding this comment.
Why should we? Can .plzconfigs be directories?
| return "" | ||
| } | ||
|
|
||
| sort.Sort(filteredLabels) |
There was a problem hiding this comment.
In the interests of making minimal changes to the export, I think we should remove this sort?
There was a problem hiding this comment.
Since we use a map to register these labels, the order of insertion is not enforced. Sorting ensures that the output is deterministic. We could possibly retain some of the original order by changing some of the logic in PackageMetadata if you consider this important, however, if we keep the formatting I believe it sorts the subincludes.
| return func() *core.BuildStatement { | ||
| stmtScope := s | ||
| for curr := s; curr != nil; curr = curr.callerScope { | ||
| if curr.pkg != nil && curr.filename == s.pkg.Filename { |
There was a problem hiding this comment.
Think we need more of a comment here and more in the doc comment to explain this condition and why we don't break once it's true (which I'm guessing is to handle somebody defining a function in a BUILD file? Do we have an export test case for that? And an export test case for statements inside loops and if-statements?).
If I understand this correctly, we're effectively looking for the highest-level function call which is inside a BUILD file (as opposed to calls within other functions)?
Can we unit-test this function and ActiveSubincludes?
There was a problem hiding this comment.
Agreed, thanks for calling this out. I've added more in depth comments for both methods. e2e test for a function definition in the same package file was also added.
I've added unit test for both methods but I'm not thrilled about them, let me know if you have any ideas on how to improve. I took the approach of building and linking scopes directly, one potential idea could be to define some custom "native_code", inject in the original scope and use that as a callback when parsing a file. Either approaches don't seem great to me.
| if f.nativeCode != nil { | ||
| if f.kwargs { | ||
| return f.callNative(s.NewScope("<builtin code>", 0), c) | ||
| return f.callNative(s.NewScope("", 0), c) |
There was a problem hiding this comment.
Why this change? I think the <builtin code> thing was useful for debugging
There was a problem hiding this comment.
I added that in the previous PR, but since that argument is supposed to be a filename I'm not sure it makes sense. If we attempt to open a file with it, it should fail the same as with using "", I just thought it could be misleading but I'm happy to revert this.
…s. Add fatal to no-op implementation
Enhancements to plz export, moving from a basic target-level trimming (using gc.RewriteFile) to build statement-level trimming, including only the required build rules and subincludes.
For consistency, we format all the exported BUILD files.
Changelog:
ToDo: