One of the reasons for indexing machine learning is to make the community aware of problems. For example, indexing results reveals apples and oranges problems with evaluation. Indexing datasets allows us to tell people which *shouldn't be used* (e.g. retractions, or biases)...
Without indexing you have the problem of: researchers in papers making unfair comparisons in their papers (e.g. misleadingly claiming SoTA), or researchers finding a dataset through Google, e.g. Duke-MTMC, and not being aware it is retracted -> and then using it...
So that's why it's so valuable to make a map of the territory. Once we have a map, we can then put labels on bits and then use our reach to publicize new problems that people have found. Two examples of this below:
This paper on recommender systems revealed "phantom progress" with reported paper results not being reproducible in recommender systems. Imagine if could "scale" amazing papers like this automatically to surface problems like this in every field of ML... https://arxiv.org/abs/1907.06902 
Well that's what our benchmarks and results extraction essentially do at @paperswithcode. We've found *loads* of reproducibility problems across 100 of tasks. We wouldn't know about them unless we first indexed all the paper results!
Similarly now that we've made a map of datasets, we can now systematically be a force in revealing problems like bias, IRB infringements, and so on. We can put a great big STOP sign saying "don't use this dataset!". We've already started doing this: https://twitter.com/adamhrv/status/1356937667997499399?s=20
So indexing isn't just about better discovery of content, it's about discovery of the *nature* of content. We give people context that they may not have in the absence of that indexing solution...
You can follow @rosstaylor90.
Tip: mention @twtextapp on a Twitter thread with the keyword “unroll” to get a link to it.

Latest Threads Unrolled:

By continuing to use the site, you are consenting to the use of cookies as explained in our Cookie Policy to improve your experience.