Reading from Files in Java: Common Scenarios and Approaches

The ability to read content from the local file system has been available since the very first versions of Java. This ability is standardized in the java.io package. We are going to describe some common scenarios for reading from files using Java.

1. Load All File Content at Once

In this kind of file read operation, no special meaning is attributed to any portion of the data while it is read. The focus is to have the entire file content loaded into application memory, usually as an array of bytes. The data is further processed using other packages and libraries once all the file content has finished loading.

A good example of such a file load operation is to read the data from a PNG or JPG file and then pass the byte array to the ImageIO class, which decodes it into an image that can be rendered on the screen.
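As a rough sketch of that handoff, the helper below decodes image bytes that are already in memory. The class name ImageDecoder is hypothetical and not part of the original example. ImageIO.read() returns null when none of the registered image readers recognizes the data.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration only.
class ImageDecoder {

    // Decodes PNG/JPG bytes that were previously loaded from a file.
    // Returns null if no registered ImageIO reader can decode the data.
    static BufferedImage decode(byte[] fileData) throws IOException {
        try (ByteArrayInputStream in = new ByteArrayInputStream(fileData)) {
            return ImageIO.read(in);
        }
    }
}
```

The resulting BufferedImage can then be drawn by whichever UI toolkit the application uses.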


The following code snippet can be used to load all file content into memory:

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;

// input.dat is the file to load into memory.
File f = new File("input.dat");
FileInputStream fis = new FileInputStream(f);

// Pre-size the in-memory buffer to the length of the file.
ByteArrayOutputStream buffer = new ByteArrayOutputStream((int) f.length());
byte[] block = new byte[4096];
int readAmount = 0;

// read() returns the number of bytes read, or -1 at end of file.
while ((readAmount = fis.read(block)) >= 0) {
    if (readAmount > 0) {
        buffer.write(block, 0, readAmount);
    }
}

buffer.flush();
buffer.close();
fis.close();

// fileData is the final location of the file content in memory.
byte[] fileData = buffer.toByteArray();

The byte array output stream serves as an in-memory buffer that can be dynamically expanded to contain more data as it gets read from the file. It is pre-allocated to the size of the file using the constructor parameter. This removes the overhead of heap space re-allocation and movement that would happen each time the buffer within the stream is expanded to hold more data.

The while loop reads data from the file in 4 KB blocks. You can choose a larger block size to reduce the number of read operations performed. Each read operation returns the number of bytes actually read into the block. Based on this value, the corresponding portion of the block is written into the buffer instance.

If the ‘read by block’ operation returns a value that is less than 0, it indicates that the end of file (EOF) condition has been reached and that no further data is available for reading. All streams are then closed and the data is extracted from the buffer instance into an array of bytes.
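The same loop can be packaged as a reusable method. The following is a minimal sketch, not a definitive implementation: the class name WholeFileLoader and its method names are hypothetical, and try-with-resources (Java 7 and later) is used here so that the file stream is closed even if a read fails.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper around the block-read loop.
class WholeFileLoader {

    // Reads every byte from the stream, blockSize bytes at a time.
    // expectedSize only pre-sizes the buffer; the buffer still grows if needed.
    static byte[] readAll(InputStream in, int expectedSize, int blockSize) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream(Math.max(expectedSize, 0));
        byte[] block = new byte[blockSize];
        int readAmount;
        // read() returns -1 once the end of the stream has been reached.
        while ((readAmount = in.read(block)) >= 0) {
            buffer.write(block, 0, readAmount);
        }
        return buffer.toByteArray();
    }

    // Loads a whole file using 4 KB blocks.
    static byte[] load(File f) throws IOException {
        try (InputStream in = new FileInputStream(f)) {
            return readAll(in, (int) f.length(), 4096);
        }
    }
}
```

On Java 7 and later, java.nio.file.Files.readAllBytes() offers a one-call alternative when no custom block handling is needed.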


2. Line-by-Line Loading of File Content

This scenario assumes that you are reading content that is primarily textual. The textual content is treated as a collection of lines. Each line is terminated by a carriage return (CR), a line feed (LF), or both, depending on the operating system that produced the file. Windows line endings use CR/LF (“\r\n”) while Linux uses just LF (“\n”).
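BufferedReader.readLine() accepts all of these terminators, so the same reading code works for files from either platform. The small sketch below illustrates this with an in-memory string instead of a file; the class name LineEndingDemo is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration only.
class LineEndingDemo {

    // Splits text into lines; readLine() treats "\n", "\r", and "\r\n" as terminators
    // and returns each line without its terminator.
    static List<String> splitLines(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

For example, splitLines("a\r\nb\nc") yields the three lines "a", "b", and "c".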

The following code snippet can be used to load the contents of a file into an array of strings in memory with each string representing a line in the input file.

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.ArrayList;

ArrayList<String> lines = new ArrayList<String>();

// input.txt is the file to load into memory.
File f = new File("input.txt");
BufferedReader reader = new BufferedReader(new FileReader(f));
String oneLine = null;

// readLine() returns null at end of file.
while ((oneLine = reader.readLine()) != null) {
    // Keep only lines that contain something other than whitespace.
    if (oneLine.trim().length() > 0) {
        lines.add(oneLine);
    }
}

reader.close();

// allLines is the final collection of lines that constitute the content of the file.
String[] allLines = lines.toArray(new String[lines.size()]);

The snippet above uses the BufferedReader class, which provides a convenient readLine() method to read one line of content from the underlying reader. The returned string does not include the line terminator. If readLine() returns null, the end of file (EOF) has been reached and no further content is available for reading. Note that the snippet skips lines that are empty or contain only whitespace.

Conversion from a list of strings into an array of strings is provided in case the output is expected to be of one form rather than the other. Therefore, the last non-comment line in the above snippet is optional.
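One way to offer both forms is to keep the list-producing logic in one method and do the array conversion in a thin wrapper. The sketch below is an illustration under assumptions: the class name LineLoader and its methods are hypothetical, and the file is assumed to be UTF-8 encoded (FileReader without a charset argument uses the platform default instead).

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class LineLoader {

    // Returns all lines that contain something other than whitespace.
    static List<String> readNonBlankLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String oneLine;
            while ((oneLine = reader.readLine()) != null) {
                if (!oneLine.trim().isEmpty()) {
                    lines.add(oneLine);
                }
            }
        }
        return lines;
    }

    // Same content as an array; the UTF-8 encoding is an assumption.
    static String[] loadAsArray(File f) throws IOException {
        Reader source = new InputStreamReader(new FileInputStream(f), StandardCharsets.UTF_8);
        return readNonBlankLines(source).toArray(new String[0]);
    }
}
```

Callers that prefer a list can use readNonBlankLines() directly and skip the conversion.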

3. Load Structured Data from a File

A good example of structured data is a file of line-by-line entries in which each line is a key-value pair separated by the ‘=’ character. Each line is read from the file, parsed, and stored in a data structure that exposes the java.util.Map interface before the next line is processed.
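To make the read, parse, and store steps concrete, here is a hand-written sketch of that loop. The class name KeyValueParser is hypothetical, and the handling of malformed lines (skipping lines without a separator or with an empty key) is an assumption made for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for illustration only.
class KeyValueParser {

    // Reads "key=value" lines into a map, preserving the order of first appearance.
    static Map<String, String> parse(Reader source) throws IOException {
        Map<String, String> pairs = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int separator = line.indexOf('=');
                if (separator < 0) {
                    continue; // no '=' on this line (assumption: skip it)
                }
                String key = line.substring(0, separator).trim();
                if (key.isEmpty()) {
                    continue; // empty key (assumption: skip it)
                }
                // Everything after the first '=' belongs to the value.
                String value = line.substring(separator + 1).trim();
                pairs.put(key, value);
            }
        }
        return pairs;
    }
}
```

Splitting on the first ‘=’ only means that a value such as "a=b" survives intact.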

The following code snippet provides an example of how key-value pairs are loaded from a file into memory.

import java.io.File;
import java.io.FileInputStream;
import java.util.Properties;

// input.txt is the file to load into memory as key-value pairs.
File f = new File("input.txt");
FileInputStream fis = new FileInputStream(f);
Properties keyValuePairs = new Properties();
keyValuePairs.load(fis);
fis.close();

Note the use of the Properties class, which is part of Java’s core libraries in the java.util package. Its load() method reads the file one line at a time and parses each line into a key-value pair. Properties also implements the java.util.Map interface (it extends Hashtable), so individual values can be looked up by key, most conveniently through getProperty().
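The short sketch below shows the lookup side, loading from an in-memory string through the load(Reader) overload so that it runs without a file. The class name PropertiesDemo and the sample keys are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical demo class for illustration only.
class PropertiesDemo {

    // Parses key-value text with the standard Properties parser.
    static Properties parse(String text) throws IOException {
        Properties keyValuePairs = new Properties();
        keyValuePairs.load(new StringReader(text));
        return keyValuePairs;
    }

    public static void main(String[] args) throws IOException {
        Properties p = parse("host=example.org\n# a comment line\nport = 8080\n");
        System.out.println(p.getProperty("host"));
        System.out.println(p.getProperty("port"));
        // The two-argument form supplies a default for missing keys.
        System.out.println(p.getProperty("timeout", "30"));
    }
}
```

Lines that start with ‘#’ or ‘!’ are treated as comments, and whitespace around the separator is ignored. Keep in mind that load(InputStream) assumes ISO 8859-1 encoded input, whereas load(Reader) uses whatever encoding the reader was created with.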

4. Load a Set of Records from a File

In this scenario, it’s assumed that a single file contains a collection of records where each record is a collection of bytes of variable length. Further information can be extracted on a per record basis. However, it is done only after the record has been loaded into application memory.

To write records to a file, the following approach is taken:

  1. For each record, 4 bytes of data are first written into the file. These 4 bytes hold an integer value that represents the number of bytes in the record.
  2. This is followed by an array of bytes that represents the content of the record.
  3. Repeat step #1 and write the next record.
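The writing steps above can be sketched as follows. The class name RecordWriter is hypothetical; DataOutputStream.writeInt() writes the length as four bytes, high byte first, which is the layout that DataInputStream.readInt() expects when the records are read back.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical writer for length-prefixed records, for illustration only.
class RecordWriter {

    private final DataOutputStream dos;

    RecordWriter(OutputStream out) {
        dos = new DataOutputStream(out);
    }

    // Writes one record: a 4-byte length followed by the record bytes.
    void write(byte[] record) throws IOException {
        dos.writeInt(record.length);
        dos.write(record);
    }

    // Pushes any buffered bytes to the underlying stream.
    void flush() throws IOException {
        dos.flush();
    }
}
```

Wrapping a FileOutputStream in this writer produces a file in the record layout described above.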


The following code snippet shows how the records can be read one at a time from a file:

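import java.io.DataInputStream;
import java.io.InputStream;

public class RecordReader {

    private DataInputStream dis;

    public RecordReader(InputStream in) {
        dis = new DataInputStream(in);
    }

    // Returns the next record, or null when no further record can be read.
    public byte[] read() {
        try {
            // The first 4 bytes hold the length of the record.
            int length = dis.readInt();
            byte[] record = new byte[length];
            // Block until the whole record has been read.
            dis.readFully(record);
            return record;
        }
        catch (Exception exep) {
            // End of file or any other failure ends the reading.
            return null;
        }
    }
}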

And then in another method, you can do the following:

// input.txt is the file whose contents are to be loaded
// into memory as variable length records.
File f = new File("input.txt");
FileInputStream fis = new FileInputStream(f);
RecordReader rr = new RecordReader(fis);
byte[] record = null;

while ((record = rr.read()) != null) {
    // convert the record byte array into a Record object.
}

fis.close();

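The main read logic is in the read() method of the RecordReader class. If the method returns null, the end of file has been reached and no further data is available. Because read() catches every exception, it also returns null when a read error occurs or when the file ends partway through a record.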
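If callers need to tell a normal end of file apart from a damaged file, one possible variation is to treat only a failed length read as the end of data and let other failures propagate. The sketch below is an assumption-laden illustration, not the original design: the class name StrictRecordReader is hypothetical, and a file that ends inside the 4-byte length field is still reported as a normal end of file.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stricter variant of the record reader, for illustration only.
class StrictRecordReader {

    private final DataInputStream dis;

    StrictRecordReader(InputStream in) {
        dis = new DataInputStream(in);
    }

    // Returns the next record, or null when no further length prefix can be read.
    // Throws IOException for read errors and EOFException for a truncated record body.
    byte[] read() throws IOException {
        int length;
        try {
            length = dis.readInt();
        } catch (EOFException endOfFile) {
            return null; // no complete length prefix left
        }
        if (length < 0) {
            throw new IOException("Negative record length: " + length);
        }
        byte[] record = new byte[length];
        // readFully throws EOFException if the stream ends before the record is complete.
        dis.readFully(record);
        return record;
    }
}
```

The calling loop stays the same as before, except that it now has to handle or declare IOException.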