Hello Experts,
I’m currently researching the use of Flume to ingest raw XML, CSV and text files. These files are generated by machines on a production line and stored on a share, and I intend to use the Spooling Directory source to read them and store them in HDFS. As the files have different structures and I don’t want to merge them as they’re ingested, I’m thinking about having one folder per file type, each with its own Spooling Directory source, channel and HDFS sink.
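To make the layout concrete, here is a minimal sketch of what I have in mind, written as a small Python script that generates the Flume properties for a list of file types. The agent name (a1), the spool root, the HDFS root and the component naming scheme (src_*, ch_*, sink_*) are placeholders I made up for illustration, and the memory channel is just an assumption to keep the example short.

```python
def build_flume_config(agent, file_types, spool_root, hdfs_root):
    """Return Flume properties text with one source, channel and sink per file type.

    All names and paths passed in are illustrative placeholders.
    """
    # Top-level lists of component names for the agent.
    lines = [
        f"{agent}.sources = " + " ".join(f"src_{t}" for t in file_types),
        f"{agent}.channels = " + " ".join(f"ch_{t}" for t in file_types),
        f"{agent}.sinks = " + " ".join(f"sink_{t}" for t in file_types),
    ]
    for t in file_types:
        src = f"{agent}.sources.src_{t}"
        ch = f"{agent}.channels.ch_{t}"
        sink = f"{agent}.sinks.sink_{t}"
        lines += [
            "",
            # Spooling Directory source watching one folder per file type.
            f"{src}.type = spooldir",
            f"{src}.spoolDir = {spool_root}/{t}",
            f"{src}.channels = ch_{t}",
            # Memory channel assumed here only for brevity.
            f"{ch}.type = memory",
            # HDFS sink writing each type to its own directory.
            f"{sink}.type = hdfs",
            f"{sink}.hdfs.path = {hdfs_root}/{t}",
            f"{sink}.hdfs.fileType = DataStream",
            f"{sink}.channel = ch_{t}",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file types and paths, for illustration only.
    print(build_flume_config("a1", ["xml", "csv", "txt"],
                             "/data/spool", "hdfs://namenode/raw"))
```

Generating the file this way is easy enough, but the agent would still run one source, one channel and one sink per file type.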
The problem is that I expect to have more than 300 different file types, which would mean 300 sources, channels and sinks. Is there a more practical way to achieve this?
Best regards,
Clóvis