So, people who called me names, here is a test for you. You need to use Python.

- You have 100k CSVs in a folder.
- Read all files in the folder
- Combine them in a single CSV
- Save the combined file for feature engineering using pandas
- All files share the same header
1/4
Where do I find 100k CSVs in a folder? Well, in many scenarios and real-life situations. I have made it easy for you: https://github.com/abhishekkrthakur/csv_test

Those who called me names must use pandas.
Those who are willing to learn, scroll below.

2/4
Using pandas in a simple way took me 120 seconds to do this. Using pure Python took 5.5 seconds, and using PyPy took 3.8 seconds. That's why it's important to learn the basics too.
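
For anyone wondering what a pure Python route could look like, here is a minimal sketch. It is one possible approach, not necessarily the one timed above. It assumes every file ends in `.csv`, the header is the first physical line of each file, and nothing needs parsing, so lines are copied as plain text. The function name `combine_csvs` and both of its parameters are made up for illustration.

```python
import glob
import os


def combine_csvs(input_dir, output_path):
    """Concatenate every *.csv in input_dir into output_path with one header.

    Assumption: all files share the same header on their first line,
    so the header is written once and skipped in every later file.
    Returns the number of files that were combined.
    """
    # Collect the paths before opening the output, in a stable order.
    paths = sorted(glob.glob(os.path.join(input_dir, "*.csv")))
    out_abs = os.path.abspath(output_path)
    header_written = False
    combined = 0

    # newline="" keeps the original line endings untouched.
    with open(output_path, "w", newline="") as out:
        for path in paths:
            # Never read the output file back in if it lives in input_dir.
            if os.path.abspath(path) == out_abs:
                continue
            with open(path, "r", newline="") as f:
                header = f.readline()
                if not header:
                    continue  # empty file, nothing to copy
                if not header_written:
                    out.write(header if header.endswith("\n") else header + "\n")
                    header_written = True
                for line in f:
                    # Guard against a last row with no trailing newline.
                    out.write(line if line.endswith("\n") else line + "\n")
            combined += 1
    return combined
```

Calling something like `combine_csvs("csv_test", "combined.csv")` (both paths are placeholders) would write a single file that pandas can then load in one go. The sketch never builds a DataFrame per file, which is probably where most of the naive pandas time goes.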

After that, I'll use pandas for feature engineering. You don't need a bazooka to kill a fly.

3/4
The solution comes in the evening if the people who called me a gatekeeper can't post one first.

P.S. I don't care if you hate my ways, but beginners learn something new all the time and that's what matters to me.

4/4
You can follow @abhi1thakur.