Move Files from One Folder or Directory to Another using Apache NiFi

In this post, we will read files from one local folder and move them to another using NiFi. The same flow can also be used to copy files from one directory to another.

Apache NiFi is a software project from the Apache Software Foundation designed to automate the flow of data between software systems. Apache NiFi supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.


Move Files from One Folder to Another

I have created two folders on my local system named Input and Output. I will be keeping multiple files in the Input folder.

I will be moving files of any extension, not just one specific type.

Go to the NiFi Web UI and drag and drop a new processor onto the workspace.

Search for the GetFile processor and click on Add.

At this point, the processor will show an error because there are required properties that need to be set. Double-click the GetFile processor, or right-click it and select Configure.

Navigate to the Properties tab.

Enter the Input Directory where your files are kept. If you want to pick up only specific files, you can change the File Filter property, which takes a regular expression that is matched against the file names.

The Batch Size property sets the maximum number of files to pick up in a single batch. If you want to keep the files at the source location (in our case the Input folder), set the Keep Source File property to true. This will copy the files instead of moving them.
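To get a feel for how a regular-expression File Filter narrows down the files, here is a small Python sketch. It is only an illustration and not NiFi code: the `select_files` helper and the sample file names are made up, and I am assuming the default File Filter is `[^\.].*`, which skips files whose names start with a dot. NiFi uses Java regular expressions, so more advanced patterns may behave slightly differently than they do in Python.

```python
import re

# Assumption: GetFile's default File Filter is [^\.].* (ignores dot-files).
DEFAULT_FILTER = r"[^\.].*"


def select_files(names, file_filter=DEFAULT_FILTER):
    """Hypothetical helper: keep names where the regex matches the whole name."""
    pattern = re.compile(file_filter)
    return [name for name in names if pattern.fullmatch(name)]


# Made-up file names, just for the demonstration.
names = ["report.csv", "notes.txt", ".hidden", "data.CSV"]

print(select_files(names))              # ['report.csv', 'notes.txt', 'data.CSV']
print(select_files(names, r".*\.csv"))  # ['report.csv']
```

Note that the second pattern is case-sensitive, so `data.CSV` is not picked up.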
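The effect of Batch Size and Keep Source File can be sketched in plain Python. This is a simplified stand-in for the whole GetFile to PutFile flow, not how NiFi works internally (NiFi reads each file into a flow file and passes it through a queue). The function name `pick_up_files` and its parameters are hypothetical, and the batch size of 10 is my assumption about the default.

```python
import shutil
from pathlib import Path


def pick_up_files(input_dir, output_dir, batch_size=10, keep_source_file=False):
    """Hypothetical sketch: transfer at most batch_size files in one pass.

    keep_source_file=True copies the files (the source stays in place);
    keep_source_file=False moves them (the source is removed).
    """
    src, dst = Path(input_dir), Path(output_dir)
    dst.mkdir(parents=True, exist_ok=True)

    # Take only regular files, sorted by name, limited to one batch.
    batch = sorted(p for p in src.iterdir() if p.is_file())[:batch_size]

    for path in batch:
        target = dst / path.name
        if keep_source_file:
            shutil.copy2(path, target)           # copy, keep the original
        else:
            shutil.move(str(path), str(target))  # move, original disappears
    return [p.name for p in batch]
```

Calling `pick_up_files("Input", "Output", keep_source_file=True)` would leave the Input folder untouched, while the default call empties it one batch at a time.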

I have just changed the Input Directory property and kept all the others as default. After setting the property, click on Apply.

You will still see the error on the processor because we have not connected this processor to the next one. Drag a new processor and search for PutFile.

Connect the two processors by dragging from GetFile to PutFile. In the connection dialog, select the success relationship.

This is how the flow looks after connecting the processors. GetFile no longer shows the error, and its state is Stopped.

Next, we need to configure the PutFile processor. Double-click it, or right-click it and select Configure. Go to the Properties tab and provide the Directory. This is the directory the files will be moved to. Leave the other properties at their defaults unless you have a specific reason to change them.

After you have set the properties, go to the Settings tab. Check the auto-terminate boxes for both the failure and success relationships. I am doing this because PutFile is my last processor and there is nothing to connect after it. In a more complex flow, you would instead connect the success relationship so that the flow file is passed on to the next processor.

Click Apply to save the changes. Below is what the final flow looks like. Both processors are in the Stopped state.

For better understanding, we will run the processors one at a time. Right-click GetFile and select Start. Once you see the Queued count on the connection change, you can stop the GetFile processor.

I had kept 4 files in the Input folder. You can check the file names, sizes and other metadata by right-clicking the queued items and selecting List Queue.

Here you can see the file names and their sizes. To see more attributes related to a file, click on the info icon.

You can also see the content of the file and can download it as well.

After you have finished examining the details and attributes of the files, close this window and start the PutFile processor.

This moves the files from one location to another. Check the folders manually to confirm that it worked.
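If you would rather not check by eye, a few lines of Python can do the comparison. This is just a sketch: `check_move` is a hypothetical helper, and you would pass in your own folder paths and the file names you originally placed in the Input folder.

```python
from pathlib import Path


def check_move(input_dir, output_dir, expected_names):
    """Hypothetical helper: report which expected files did not move cleanly."""
    remaining = {p.name for p in Path(input_dir).iterdir() if p.is_file()}
    arrived = {p.name for p in Path(output_dir).iterdir() if p.is_file()}
    expected = set(expected_names)
    return {
        # Expected files that never reached the output folder.
        "missing_from_output": sorted(expected - arrived),
        # Expected files that are still sitting in the input folder.
        "still_in_input": sorted(expected & remaining),
    }
```

For a move, both lists should come back empty. If you set Keep Source File to true, the files are copied, so `still_in_input` will list them and only `missing_from_output` should be empty.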

Thank you All!!! Hope you find this useful.
