Decide order to iterate files

I'm iterating through a bunch of files in Google Storage. This works fine, but I would like to process them in a specific order. The file names contain dates, and I would like to read the oldest file first. Any suggestions on how to do this?

2 Community Answers

Matillion Agent  

Kalyan Arangam —

Hi Cris,

The file iterator itself seems to return filenames alphabetically. If your files share a prefix and end in a sortable date (prefix_yyyymmdd), alphabetical order is also chronological order, so it may already return files in the order you need.
To check, add a file iterator to your job, point it at your folder, and map a variable to Filename. Attach a Python script component to it and print the filename.

What order do you see?

Another approach is to load the filenames into a table in BigQuery and iterate over them (Table/Grid iterator) using a sort order.

You may also read the list of filenames from a Python component using the boto library, sort the filenames as necessary, and then take some action on each one.
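As a rough sketch of the sorting step only, the snippet below orders a list of filenames by the date embedded in each name, oldest first. It assumes the date appears as eight digits in yyyymmdd form; the sample filenames and the helper names extract_date and sort_oldest_first are made up for illustration, and fetching the real list from Google Storage is left out.

```python
import re
from datetime import datetime

# Assumption: each filename contains a date as eight consecutive digits (yyyymmdd).
DATE_PATTERN = re.compile(r"(\d{8})")


def extract_date(filename):
    """Return the date embedded in the filename as a datetime."""
    match = DATE_PATTERN.search(filename)
    if match is None:
        raise ValueError("No yyyymmdd date found in: " + filename)
    return datetime.strptime(match.group(1), "%Y%m%d")


def sort_oldest_first(filenames):
    """Return a new list of filenames ordered by embedded date, oldest first."""
    return sorted(filenames, key=extract_date)


# Hypothetical filenames; in a real job this list would come from the bucket listing.
filenames = [
    "sales_20180315.csv",
    "sales_20171201.csv",
    "sales_20180102.csv",
]

for name in sort_oldest_first(filenames):
    print(name)
```

Sorting on the parsed date rather than on the raw string means the order stays correct even if the prefixes differ between files.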

Hope that gives you some pointers on what's possible.


Cristian Ivanoff —

Thanks Kalyan! I will try your suggested approaches...
