Hadoop SequenceFile : A SequenceFile is a flat file consisting of binary key-value pairs. Any key-value pair can be stored, with keys and values serialized as byte arrays.
Need for SequenceFile : Hadoop is meant for processing Big Data. HDFS has a default block size of 64 MB (128 MB from Hadoop 2.x onward). A file smaller than the block size does not physically pad its block on disk, but every file, however small, costs the NameNode an in-memory metadata entry and is typically processed by its own map task, so millions of small files quickly overwhelm a cluster.
3 Types :
Uncompressed - records are stored as-is.
Record-Compressed - only the values are compressed, one record at a time.
Block-Compressed - keys and values of many records are compressed together, which usually gives the best compression ratio.
The format is chosen when the writer is created, as in the sketch below.
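A minimal sketch of selecting the format through the option-based SequenceFile.createWriter API available since Hadoop 2.x; the file name "demo.seq" and the Text key/value types are illustrative assumptions, not requirements of the format:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.SequenceFile.CompressionType;
import org.apache.hadoop.io.Text;

public class CompressionDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The format is a property of the writer:
        //   CompressionType.NONE   -> uncompressed records
        //   CompressionType.RECORD -> each value compressed on its own
        //   CompressionType.BLOCK  -> keys and values compressed in batches
        SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(new Path("demo.seq")),   // hypothetical path
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(Text.class),
                SequenceFile.Writer.compression(CompressionType.BLOCK));
        writer.append(new Text("hello"), new Text("world"));
        writer.close();
    }
}
```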
In practice, applications often deal with large numbers of files only a few KB in size. It is therefore advantageous to pack such small files into a SequenceFile as key-value pairs (conventionally, the file name as the key and the file contents as the value), which lets MapReduce jobs apply the same logic to every packed file; a packing sketch follows.
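A sketch of packing a local directory of small files into one SequenceFile, assuming hadoop-common is on the classpath; the directory name "small-files", the output name "packed.seq", and the filename-as-key convention are assumptions for illustration:

```java
import java.io.File;
import java.nio.file.Files;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SmallFilePacker {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path out = new Path("packed.seq");   // hypothetical output path
        SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(out),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(BytesWritable.class));
        try {
            // Pack every file in the directory: key = file name, value = raw bytes.
            for (File f : new File("small-files").listFiles()) {   // hypothetical input dir
                byte[] bytes = Files.readAllBytes(f.toPath());
                writer.append(new Text(f.getName()), new BytesWritable(bytes));
            }
        } finally {
            IOUtils.closeStream(writer);
        }
    }
}
```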
The org.apache.hadoop.io.SequenceFile class provides nested Reader and Writer classes for reading and writing, and a Sorter class for sorting SequenceFiles by key.
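A sketch of reading back the packed file and then sorting it by key with SequenceFile.Sorter; the paths reuse the hypothetical names from the previous sketch:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class ReadAndSortDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path in = new Path("packed.seq");       // hypothetical input
        Path sorted = new Path("sorted.seq");   // hypothetical output

        // Iterate over every key-value pair in the file.
        SequenceFile.Reader reader =
                new SequenceFile.Reader(conf, SequenceFile.Reader.file(in));
        Text key = new Text();
        BytesWritable value = new BytesWritable();
        while (reader.next(key, value)) {
            System.out.println(key + " -> " + value.getLength() + " bytes");
        }
        reader.close();

        // Sort the file by key into a new SequenceFile.
        FileSystem fs = FileSystem.get(conf);
        SequenceFile.Sorter sorter =
                new SequenceFile.Sorter(fs, Text.class, BytesWritable.class, conf);
        sorter.sort(new Path[]{in}, sorted, false);  // false = keep the input file
    }
}
```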