1. Files, Directories and File Access

Despite everything moving to the cloud and databases becoming all the rage, the good old file system is still relevant for storing and organizing important documents. Even tech-savvy folks like Captain Bonny Brain and Captain CiaoCiao still rely heavily on local storage. There are certain things that just shouldn’t see the light of day, you know?

Prerequisites

  • know the basics of the File class, the Path interface, and the Files class

  • be able to create temporary files

  • be able to read metadata from files and directories

  • be able to list and filter directory contents

  • be able to read and write complete files

  • know RandomAccessFile

Data types used in this chapter:

1.1. Path and Files

As with so many things in Java, there is the "old" way and the "new" way when it comes to file processing. In many examples, you still see code with the types java.io.File, FileInputStream, FileOutputStream, FileReader and FileWriter. These types are no longer up-to-date, so in this chapter we will deal exclusively with Path and Files because these types allow the use of virtual file systems, like a ZIP archive. File is now only required when actually dealing with files or directories of the local file system; examples would be opening files with the programs associated by the operating system or redirecting data streams from externally started programs.
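To make the point about virtual file systems concrete, here is a minimal sketch that writes a text entry into a ZIP archive and reads it back with the same Files calls used for ordinary files. The class name ZipFileSystemDemo and the method roundTrip are made up for this illustration, and the sketch assumes Java 11 or newer and a ZIP file that does not exist yet.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

class ZipFileSystemDemo {

  // Hypothetical helper: stores text in an entry of a ZIP archive and reads it back.
  // The archive is created if it does not exist ("create" = "true").
  static String roundTrip( Path zipFile, String entryName, String text ) throws IOException {
    URI uri = URI.create( "jar:" + zipFile.toUri() );
    try ( FileSystem zipFs = FileSystems.newFileSystem( uri, Map.of( "create", "true" ) ) ) {
      // This Path belongs to the ZIP file system, not to the local disk
      Path entry = zipFs.getPath( entryName );
      Files.writeString( entry, text );
      return Files.readString( entry );
    }
  }
}
```

The same code would not be possible with java.io.File, because a File object always refers to the default file system.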

1.1.1. Display saying of the day ⭐

Sometimes Captain CiaoCiao can’t quite get motivated. A motivational or thought-provoking saying of the day gives the grump something new to think about. The task is to program an application that generates an HTML file with a saying and then opens the browser to display this text. The exercise can be solved with two methods of java.nio.file.Files.

Exercise:

  • Create a temporary file with the file suffix .html using an appropriate Files method.

  • Write HTML code in the new temporary file, such as the following:

    <!DOCTYPE html><html><body>
    'The things we steal tell us who we are.'
    - Thomas of Tew
    </body></html>
  • Find a method from the java.awt.Desktop class that opens the default browser and displays the HTML file as a result.
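The three steps can be sketched roughly as follows. This is only one possible approach, not the reference solution; the class name SayingOfTheDay and both method names are invented for the sketch, and Files.writeString assumes Java 11 or newer.

```java
import java.awt.Desktop;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class SayingOfTheDay {

  // Creates a temporary file with the suffix .html and writes the given HTML into it
  static Path writeSaying( String html ) throws IOException {
    Path file = Files.createTempFile( "saying", ".html" );
    Files.writeString( file, html );
    return file;
  }

  // Asks the desktop to show the file in the default browser, if that is supported
  static void openInBrowser( Path file ) throws IOException {
    if ( Desktop.isDesktopSupported()
         && Desktop.getDesktop().isSupported( Desktop.Action.BROWSE ) )
      Desktop.getDesktop().browse( file.toUri() );
  }
}
```

On a server or in another headless environment the desktop is not available, which is why the sketch checks for support before calling browse(…).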

1.1.2. Merge hiding places ⭐

With certain Files methods, an entire file can be read or written line by line in a single step.

Captain CiaoCiao collects potential hiding places in a large text file. But often he spontaneously thinks of more hiding places and quickly writes them into a new file. Now he takes his time to clean up and merge everything: the small text files are to be merged into the large file. The order of the entries in the large file must not change. Entries from the small files are added only if they do not already appear in the large file, because the main file may already contain those hiding places.

Exercise:

  • Write a method mergeFiles(Path main, Path... temp) that reads the main file, adds all new entries from the temporary files, and then writes the main file back.
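One way to approach this is a LinkedHashSet, which keeps insertion order and ignores entries it already contains. The following is a sketch under that assumption; the class name HidingPlaces is invented, and note one side effect: duplicates that already exist inside the main file are removed as well.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Set;

class HidingPlaces {

  static void mergeFiles( Path main, Path... temp ) throws IOException {
    // Entries of the main file come first, so their order is preserved
    Set<String> entries = new LinkedHashSet<>( Files.readAllLines( main ) );

    // addAll() skips every line that is already in the set
    for ( Path file : temp )
      entries.addAll( Files.readAllLines( file ) );

    // Overwrites the main file with the merged lines
    Files.write( main, entries );
  }
}
```

Both readAllLines(…) and write(…) use UTF-8 when no charset is given.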

1.1.3. Create copies of a file ⭐⭐

If you copy a file to the same folder in Windows Explorer, a copy is created. This copy automatically gets a new name. We are looking for a Java program that replicates this behavior.

Exercise:

  • Write a Java method cloneFile(Path path) that creates copies of files, generating the file names systematically. Suppose <name> symbolizes the file name, then the first copy will be Copy of <name> and thereafter the file names should be Copy (<number>) of <name>.

  • If the method is called on a directory or another error occurs, it may throw an IOException.

Example:

  • Suppose a file is called Top Secret UFO Files.txt. Then the new file names should look like this:

    • Copy of Top Secret UFO Files.txt

    • Copy (2) of Top Secret UFO Files.txt

    • Copy (3) of Top Secret UFO Files.txt

    • etc.
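The naming scheme can be sketched as a loop that tries candidate names until it finds one that is not taken. This is only an illustration: the class name FileCloner and the helper nextCopyName are invented, and the check-then-copy sequence is not safe if another process creates files in the same directory at the same time.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class FileCloner {

  // Hypothetical helper: finds the first unused name following the scheme
  // "Copy of <name>", "Copy (2) of <name>", "Copy (3) of <name>", ...
  static Path nextCopyName( Path path ) {
    String name = path.getFileName().toString();
    Path candidate = path.resolveSibling( "Copy of " + name );
    for ( int i = 2; Files.exists( candidate ); i++ )
      candidate = path.resolveSibling( "Copy (" + i + ") of " + name );
    return candidate;
  }

  static Path cloneFile( Path path ) throws IOException {
    if ( ! Files.isRegularFile( path ) )
      throw new IOException( "Not a regular file: " + path );
    // Files.copy() itself fails if the target appears in the meantime
    return Files.copy( path, nextCopyName( path ) );
  }
}
```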

1.1.4. Generate a directory listing ⭐

On the command line, the user can display directory contents and metadata, just as a file selection dialog displays files to the user.

Exercise:

  • Using Files and the newDirectoryStream(…​) method, write a program that lists the directory contents for the current directory.

  • Run the dir command in the Windows command prompt and reproduce the format of its directory listing as closely as possible. The header and footer are not necessary.
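As a starting point, the following sketch collects one row per directory entry with the modification time, the size (or a <DIR> marker) and the name. The class name DirectoryListing and the row layout are assumptions made for this illustration; the layout deliberately does not yet match the real dir output.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

class DirectoryListing {

  static List<String> list( Path dir ) throws IOException {
    List<String> rows = new ArrayList<>();
    // The stream must be closed, hence try-with-resources
    try ( DirectoryStream<Path> entries = Files.newDirectoryStream( dir ) ) {
      for ( Path entry : entries ) {
        // Reads all basic metadata of the entry in one call
        BasicFileAttributes attrs = Files.readAttributes( entry, BasicFileAttributes.class );
        String sizeOrDir = attrs.isDirectory() ? "<DIR>" : String.valueOf( attrs.size() );
        rows.add( String.format( "%s %14s %s",
                                 attrs.lastModifiedTime(), sizeOrDir, entry.getFileName() ) );
      }
    }
    return rows;
  }

  public static void main( String[] args ) throws IOException {
    // The empty path stands for the current working directory
    list( Path.of( "" ) ).forEach( System.out::println );
  }
}
```

The order of the entries returned by a DirectoryStream is not specified, so a listing that should look like dir needs its own sorting.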

1.1.5. Search for large GIF files ⭐

There’s a mess on Bonny Brain’s hard drive, partly because she stores all her images in exactly one directory. Now the pictures from the last treasure hunt are untraceable! All she remembers is that the images were saved in GIF format, and they were over 1024 pixels wide.

Exercise:

  • Given an arbitrary directory, search it (not recursively!) for all GIF images that are more than 1024 pixels wide.

Use the following code to check for the GIF format and to read the width:

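import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Both valid GIF signatures back to back, 6 bytes each
private static final byte[] GIF87aGIF89a = "GIF87aGIF89a".getBytes();

private static boolean isGifAndWidthGreaterThan1024( Path entry ) {
  if ( ! Files.isRegularFile( entry ) || ! Files.isReadable( entry ) )
    return false;

  try ( RandomAccessFile raf = new RandomAccessFile( entry.toFile(), "r" ) ) {
    byte[] bytes = new byte[ 8 ];
    // Files shorter than 8 bytes cannot hold a signature plus a width
    if ( raf.read( bytes ) < bytes.length )
      return false;

    if ( ! Arrays.equals( bytes, 0, 6, GIF87aGIF89a, 0, 6 ) &&
         ! Arrays.equals( bytes, 0, 6, GIF87aGIF89a, 6, 12 ) )
      return false;

    // Width is stored little-endian; mask with 0xFF because byte is signed
    int width = (bytes[ 6 ] & 0xFF) | ((bytes[ 7 ] & 0xFF) << 8);
    return width > 1024;
  }
  catch ( IOException e ) {
    throw new UncheckedIOException( e );
  }
}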

The method reads the first 8 bytes and checks whether the first 6 bytes match either the string GIF87a or GIF89a. In principle, this test can also be implemented with ! new String(bytes, 0, 6).matches("GIF87a|GIF89a"), but that would create several temporary objects in memory.

After the check, the program takes the next 2 bytes, which hold the width in little-endian order (low byte first), and converts them to an unsigned 16-bit integer.
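The conversion deserves a closer look, because byte is a signed type in Java. The following sketch isolates the step; the class name LittleEndian and the method name unsignedShort are invented for this illustration.

```java
class LittleEndian {

  // Combines a low and a high byte into an unsigned 16-bit value (0 to 65535).
  // Masking with 0xFF removes the sign extension that happens when a byte
  // is widened to int.
  static int unsignedShort( byte low, byte high ) {
    return (low & 0xFF) | ((high & 0xFF) << 8);
  }
}
```

For a width of 1920 pixels the file contains the bytes 0x80 and 0x07. With masking the result is 1920; without masking, 0x80 would be interpreted as -128, and -128 + (7 << 8) would give 1664.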

1.1.6. Descend directories recursively and find empty text files ⭐

There is still a big mess on Bonny Brain’s hard drive. For some inexplicable reason, the drive contains many text files with 0 bytes.

Exercise:

  • Using a FileVisitor, run recursively from a chosen starting directory through all subdirectories, looking for empty text files.

  • Text files are files that have the file extension .txt (case-insensitive).

  • If found, show the absolute path of the file on the console.
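A possible skeleton uses SimpleFileVisitor, which implements FileVisitor with default behavior so that only visitFile(…) has to be overridden. The class name EmptyTextFileFinder is invented, and the sketch collects the paths in a list instead of printing them directly so that the result is easier to check.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class EmptyTextFileFinder {

  static List<Path> findEmptyTextFiles( Path start ) throws IOException {
    List<Path> found = new ArrayList<>();
    Files.walkFileTree( start, new SimpleFileVisitor<Path>() {
      @Override
      public FileVisitResult visitFile( Path file, BasicFileAttributes attrs ) {
        // Lower-casing makes the extension check case-insensitive
        String name = file.getFileName().toString().toLowerCase( Locale.ROOT );
        if ( attrs.isRegularFile() && attrs.size() == 0 && name.endsWith( ".txt" ) )
          found.add( file.toAbsolutePath() );
        return FileVisitResult.CONTINUE;
      }
    } );
    return found;
  }

  public static void main( String[] args ) throws IOException {
    findEmptyTextFiles( Path.of( "" ) ).forEach( System.out::println );
  }
}
```

The file size comes from the BasicFileAttributes that walkFileTree(…) passes in, so no extra Files.size(…) call is needed.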

1.1.7. Develop your own utility library for file filters ⭐⭐⭐

The Files class provides three static methods to query all entries in a directory:

  • newDirectoryStream(Path dir)

  • newDirectoryStream(Path dir, String glob)

  • newDirectoryStream(Path dir, DirectoryStream.Filter<? super Path> filter)

The result is always a DirectoryStream<Path>. The first method does not filter the results, the second method allows a glob string such as *.txt, and the third method allows any filter.

java.nio.file.DirectoryStream.Filter<T> is an interface that filters must implement. Its single method, boolean accept(T entry), works like a predicate, except that it may throw an IOException.

The Java library declares the interface, but no implementation.

Exercise:

  • Write various implementations of DirectoryStream.Filter that can check files for

    • attributes (like readable, writable)

    • the length

    • the file extensions

    • the filename via regular expressions

    • magic initial identifiers

Ideally, the API allows all filters to be concatenated, something like this:

DirectoryStream.Filter<Path> filter =
    regularFile.and( readable )
               .and( largerThan( 100_000 ) )
               .and( magicNumber( 0x89, 'P', 'N', 'G' ) )
               .and( globMatches( "*.png" ) )
               .and( regexContains( "[-]" ) );

try ( DirectoryStream<Path> entries = Files.newDirectoryStream( dir, filter ) ) {
  entries.forEach( System.out::println );
}
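DirectoryStream.Filter itself has no and(…) method, so the chaining has to come from a type of our own. One conceivable design, shown here only as a sketch, is a subinterface with a default method. The names PathFilter, PathFilters, hasExtension and list are invented for this illustration, and only a few of the requested filters are included.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical subinterface that adds chaining to DirectoryStream.Filter
@FunctionalInterface
interface PathFilter extends DirectoryStream.Filter<Path> {
  default PathFilter and( DirectoryStream.Filter<? super Path> other ) {
    // Short-circuits: other is only asked if this filter accepts the entry
    return entry -> accept( entry ) && other.accept( entry );
  }
}

class PathFilters {
  static final PathFilter regularFile = Files::isRegularFile;
  static final PathFilter readable    = Files::isReadable;

  static PathFilter largerThan( long bytes ) {
    return entry -> Files.size( entry ) > bytes;
  }

  static PathFilter hasExtension( String extension ) {
    String suffix = "." + extension.toLowerCase( Locale.ROOT );
    return entry -> entry.getFileName().toString()
                         .toLowerCase( Locale.ROOT ).endsWith( suffix );
  }

  // Convenience method: collects all entries of dir accepted by the filter
  static List<Path> list( Path dir, DirectoryStream.Filter<? super Path> filter )
      throws IOException {
    List<Path> result = new ArrayList<>();
    try ( DirectoryStream<Path> entries = Files.newDirectoryStream( dir, filter ) ) {
      entries.forEach( result::add );
    }
    return result;
  }
}
```

Because accept(…) is declared with throws IOException, lambdas such as the one in largerThan(…) can call Files.size(…) without a try-catch block.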

1.2. Random access to file contents

For files, an input/output stream can be obtained and read or written from beginning to end. A different API, RandomAccessFile, allows random access: a position pointer can be moved to any offset in the file before reading or writing.
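A small sketch of the position pointer in action: the method below jumps to an offset and reads a fixed number of bytes from there. The class name RandomAccessDemo and the method name readAt are invented for this illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

class RandomAccessDemo {

  // Reads length bytes starting at the given byte offset
  static byte[] readAt( Path file, long position, int length ) throws IOException {
    try ( RandomAccessFile raf = new RandomAccessFile( file.toFile(), "r" ) ) {
      raf.seek( position );            // moves the position pointer
      byte[] buffer = new byte[ length ];
      raf.readFully( buffer );         // throws EOFException if the file is too short
      return buffer;
    }
  }
}
```

RandomAccessFile predates Path and therefore expects a File or a file name, which is why the sketch calls toFile().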

1.2.1. Output last line of a text file ⭐⭐

Crew members write all actions in an electronic logbook, with new entries appended at the end. No entry is longer than 100 characters, and the texts are written in UTF-8.

Now Captain CiaoCiao is interested in the last entry. What does a Java program look like if only the last line is to be read from a file? Since there are already plenty of entries in the log, reading the whole file is not an option.

Exercise:

  • Write a program that returns the last line of a text file.

  • Find a solution that does not use more memory than necessary.

Consider whether ([^\r\n]*)$ can be used in a meaningful way.
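One way to combine random access with the regular expression is to read only the tail of the file and search in that small window. The following is a sketch, not the reference solution. It assumes that a character needs at most 4 bytes in UTF-8, so a window of 100 * 4 bytes plus a few bytes for line terminators always covers the last line; the class name LastLine and the constant WINDOW_BYTES are invented.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastLine {

  // Assumption: 100 characters * 4 bytes, plus room for the line terminators
  // before and after the last line
  private static final int WINDOW_BYTES = 100 * 4 + 4;

  // By default, $ also matches just before a final line terminator
  private static final Pattern LAST_LINE = Pattern.compile( "([^\\r\\n]*)$" );

  static String lastLine( Path file ) throws IOException {
    try ( RandomAccessFile raf = new RandomAccessFile( file.toFile(), "r" ) ) {
      long length = raf.length();
      int window = (int) Math.min( length, WINDOW_BYTES );
      raf.seek( length - window );
      byte[] tail = new byte[ window ];
      raf.readFully( tail );

      // The window may start in the middle of a multi-byte character; the
      // resulting garbage lies before the last line break and is ignored
      Matcher matcher = LAST_LINE.matcher( new String( tail, StandardCharsets.UTF_8 ) );
      return matcher.find() ? matcher.group( 1 ) : "";
    }
  }
}
```

The memory use stays constant at a few hundred bytes, no matter how long the logbook grows.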