So, people who called me names, here is a test for you. You need to use Python.
- You have 100k CSVs in a folder.
- Read all files in the folder
- Combine them in a single CSV
- Save the combined file for feature engineering using pandas
- All files share the same header
1/4
Where do I find 100k CSVs in a folder? Well, in many real-life situations. I have made it easy for you: https://github.com/abhishekkrthakur/csv_test
Those who called me names must use pandas.
Those who are willing to learn, scroll below.
2/4
Doing this the simple way with pandas took me 120 seconds. Pure Python took 5.5 seconds, and PyPy took 3.8 seconds. That's why it's important to learn the basics too.
After that, I'll use pandas for feature engineering. You don't need a bazooka to kill a fly.
3/4
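For anyone wondering what "pure Python" could look like here, below is a minimal sketch, not necessarily the solution behind the timings above. It assumes every file ends in `.csv`, shares the same header line, and is small enough to read in one go. The function name `combine_csvs` and its arguments are made up for illustration.

```python
from pathlib import Path


def combine_csvs(folder, out_path):
    """Concatenate every *.csv in `folder` into `out_path`, keeping one header.

    Assumption: all files share the same first line (the header).
    Returns the number of input files that were combined.
    """
    out_path = Path(out_path)
    # Sort for a reproducible row order and skip the output file itself
    # in case it lives in the same folder.
    files = sorted(
        p for p in Path(folder).glob("*.csv")
        if p.resolve() != out_path.resolve()
    )
    # Work on raw bytes: no CSV parsing is needed just to glue files together.
    with open(out_path, "wb") as out:
        for i, path in enumerate(files):
            with open(path, "rb") as src:
                header = src.readline()
                if i == 0:
                    out.write(header)  # write the shared header only once
                body = src.read()
                # Guard against a file whose last row has no trailing newline.
                if body and not body.endswith(b"\n"):
                    body += b"\n"
                out.write(body)
    return len(files)
```

Calling `combine_csvs("csv_test", "combined.csv")` (both paths are placeholders) writes one file that `pandas.read_csv` can then load for feature engineering. The sketch never parses a row, it only copies bytes, which is one plausible reason an approach like this can beat building 100k DataFrames.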
The solution comes in the evening if the people who called me a gatekeeper can't post one first.
P.S. I don't care if you hate my ways, but beginners learn something new all the time and that's what matters to me.
4/4