Codecs

Codecs and Wrappers for Digital Video

In the last Greatbear article we quoted sage advice from the International Association of Audiovisual Archivists: ‘Optimal preservation measures are always a compromise between many, often conflicting parameters.’ [1]

While this statement is true in general for many different multi-format collections, the issue of compromise and conflicting parameters becomes especially apparent with the preservation of digitized and born-digital video. The reasons for this are complex, and we shall outline why below.

Lack of standards (or are there too many formats?)

Carl Fleischhauer writes, reflecting on the Federal Agencies Digitization Guidelines Initiative (FADGI) research exploring Digital File Formats for Videotape Reformatting (2014), ‘practices and technology for video reformatting are still emergent, and there are many schools of thought. Beyond the variation in practice, an archive’s choice may also depend on the types of video they wish to reformat.’ [2]

We have written in depth on this blog about the labour intensity of digital information management in relation to reformatting and migration processes (which are of course Greatbear’s bread and butter). We have also discussed how the lack of settled standards tends to make preservation decisions radically provisional.

In contrast, we have written about default standards that have emerged over time through common use and wide adoption, highlighting how parsimonious, non-interventionist approaches may be more practical in the long term.

The problem for those charged with preserving video (as opposed to digital audio or images) is that ‘video, however, is not only relatively more complex but also offers more opportunities for mixing and matching. The various uncompressed-video bitstream encodings, for example, may be wrapped in AVI, QuickTime, Matroska, and MXF.’ [3]

What then, is this ‘mixing and matching’ all about?

It refers to all the possible combinations of bitsteam encodings (‘codecs’) and ‘wrappers’ that are available as target formats for digital video files. Want to mix your JPEG2000 – Lossless with your MXF, or ffv1 with your AVI? Well, go ahead!

What then is the difference between a codec and wrapper?.

As the FADGI report states: ‘Wrappers are distinct from encodings and typically play a different role in a preservation context.’ [4]

The wrapper or ‘file envelope’ stores key information about the technical life or structural properties of the digital object. Such information is essential for long term preservation because it helps to identify, contextualize and outline the significant properties of the digital object.

Information stored in wrappers can include:

  • Content (number of video streams, length of frames),
  • Context (title of object, who created it, description of contents, re-formatting history),
  • Video rendering (Width, Height and Bit-depth, Colour Model within a given Colour Space, Pixel Aspect Ratio, Frame Rate and Compression Type, Compression Ratio and Codec),
  • Audio Rendering – Bit depth and Sample Rate, Bit Rate and compression codec, type of uncompressed sampling.
  • Structure – relationship between audio, video and metadata content. (adapted from the Jisc infokit on High Level Digitisation for Audiovisual Resources)

Codecs, on the other hand, define the parameters of the captured video signal. They are a ‘set of rules which defines how the data is encoded and packaged,’ [5] encompassing Width, Height and Bit-depth, Colour Model within a given Colour Space, Pixel Aspect Ratio and Frame Rate; the bit depth and sample rate and bit rate of the audio.

Although the wrapper is distinct from the encoded file, the encoded file cannot be read without its wrapper. The digital video file, then, comprises of wrapper and at least one codec, often two, to account for audio and images, as this illustration from AV Preserve makes clear.

Codecs and Wrappers

Diagram taken from AV Preserve’s A Primer on Codecs for Moving Image and Sound Archives

Pick and mix complexity

Why then, are there so many possible combinations of wrappers and codecs for video files, and why has a settled standard not been agreed upon?

Fleischhauer at The Signal does an excellent job outlining the different preferences within practitioner communities, in particular relating to the adoption of ‘open’ and commercial/ proprietary formats.

Compellingly, he articulates a geopolitical divergence between these two camps, with those based in the US allegedly opting for commercial formats, and those in Europe opting for ‘open.’ This observation is all the more surprising because of the advice in FADGI’s Creating and Archiving Born Digital Video: ‘choose formats that are open and non-proprietary. Non-proprietary formats are less likely to change dramatically without user input, be pulled from the marketplace or have patent or licensing restrictions.’ [6]

One answer to the question: why so many different formats can be explained by different approaches to information management in this information-driven economy. The combination of competition and innovation results in a proliferation of open source and their proprietary doubles (or triplets, quadruples, etc) that are constantly evolving in response to market ‘demand’.

Impact of the Broadcast Industry

An important area to highlight driving change in this area is the role of the broadcast industry.

Format selections in this sector have a profound impact on the creation of digital video files that will later become digital archive objects.

In the world of video, Kummer et al explain in an article in the IASA journal, ‘a codec’s suitability for use in production often dictates the chosen archive format, especially for public broadcasting companies who, by their very nature, focus on the level of productivity of the archive.’ [7] Broadcast production companies create content that needs to be able to be retrieved, often in targeted segments, with ease and accuracy. They approach the creation of digital video objects differently to how an archivist would, who would be concerned with maintaining file integrity rather ensuring the source material’s productivity.

Furthermore, production contexts in the broadcast world have a very short life span: ‘a sustainable archiving decision will have to made again in ten years’ time, since the life cycle of a production system tends to be between 3 and 5 years, and the production formats prevalent at that time may well be different to those in use now.’ [8]

Take, for example, H.264/ AVC ‘by far the most ubiquitous video coding standard to date. It will remain so probably until 2015 when volume production and infrastructure changes enable a major shift to H.265/ HEVC […] H.264/ AVC has played a key role in enabling internet video, mobile services, OTT services, IPTV and HDTV. H.264/ AVC is a mandatory format for Blu-ray players and is used by most internet streaming sites including Vimeo, youtube and iTunes. It is also used in Adobe Flash Player and Microsoft Silverlight and it has also been adopted for HDTV cable, satellite, and terrestrial broadcasting,’ writes David Bull in his book Communicating Pictures.

HEVC, which is ‘poised to make a major impact on the video industry […] offers to the potential for up to 50% compression efficiency improvement over AVC.’ Furthermore, HEVC has a ‘specific focus on bit rate reduction for increased video resolutions and on support for parallel processing as well as loss resilience and ease if integration with appropriate transport mechanisms.’ [9]

CODEC Quality Chart3Increased compression

The development of codecs for use in the broadcast industry deploy increasingly sophisticated compression that reduce bit rate but retain image quality. As AV Preserve explain in their codec primer paper, ‘we can think of compression as a second encoding process, taking coded information and transferring or constraining it to a different, generally more efficient code.’ [10]

The explosion of mobile, video data in the current media moment is one of the main reasons why sophisticated compression codecs are being developed. This should not pose any particular problems for the audiovisual archivist per se—if a file is ‘born’ with high degrees of compression the authenticity of the file should not ideally, be compromised in subsequent migrations.

Nevertheless, the influence of the broadcast industry tells us a lot about the types of files that will be entering the archive in the next 10-20 years. On a perceptual level, we might note an endearing irony: the rise of super HD and ultra HD goes hand in hand with increased compression applied to the captured signal. While compression cannot, necessarily, be understood as a simple ‘taking away’ of data, its increased use in ubiquitous media environments underlines how the perception of high definition is engineered in very specific ways, and this engineering does not automatically correlate with capturing more, or better quality, data.

Like error correction that we have discussed elsewhere on the blog, it is often the anticipation of malfunction that is factored into the design of digital media objects. These, in turn, create the impression of smooth, continuous playback—despite the chaos operating under the surface. The greater clarity of the visual image, the more the signal has been squeezed and manipulated so that it can be transmitted with speed and accuracy. [11]

MXF

Staying with the broadcast world, we will finish this article by focussing on the MXF wrapper that was ‘specifically designed to aid interoperability and interchange between different vendor systems, especially within the media and entertainment production communities. [MXF] allows different variations of files to be created for specific production environments and can act as a wrapper for metadata & other types of associated data including complex timecode, closed captions and multiple audio tracks.’ [12]

The Presto Centre’s latest TechWatch report (December 2014) asserts ‘it is very rare to meet a workflow provider that isn’t committed to using MXF,’ making it ‘the exchange format of choice.’ [13]MXF

We can see such adoption in action with the Digital Production Partnership’s AS-11 standard, which came into operation October 2014 to streamline digital file-based workflows in the UK broadcast industry.

While the FADGI reports highlights the instability of archival practices for video, the Presto Centre argue that practices are ‘currently in a state of evolution rather than revolution, and that changes are arriving step-by-step rather than with new technologies.’

They also highlight the key role of the broadcast industry as future archival ‘content producers,’ and the necessity of developing technical processes that can be complimentary for both sectors: ‘we need to look towards a world where archiving is more closely coupled to the content production process, rather than being a post-process, and this is something that is not yet being considered.’ [14]

The world of archiving and reformatting digital video is undoubtedly complex. As the quote used at the beginning of the article states, any decision can only ever be a compromise that takes into account organizational capacities and available resources.

What is positive is the amount of research openly available that can empower people with the basics, or help them to delve into the technical depths of codecs and wrappers if so desired. We hope this article will give you access to many of the interesting resources available and some key issues.

As ever, if you have a video digitization project you need to discuss, contact us—we are happy to help!

References:

[1] IASA Technical Committee (2014) Handling and Storage of Audio and Video Carriers, 6. 

[2] Carl Fleischhauer, ‘Comparing Formats for Video Digitization.’ http://blogs.loc.gov/digitalpreservation/2014/12/comparing-formats-for-video-digitization/.

[3] Federal Agencies Digital Guidelines Initiative (FADGI), Digital File Formats for Videotape Reformatting Part 5. Narrative and Summary Tables. http://www.digitizationguidelines.gov/guidelines/FADGI_VideoReFormatCompare_pt5_20141202.pdf, 4.

[4] FADGI, Digital File Formats for Videotape, 4.

[5] AV Preserve (2010) A Primer on Codecs for Moving Image and Sound Archives & 10 Recommendations for Codec Selection and Managementwww.avpreserve.com/wp-content/…/04/AVPS_Codec_Primer.pdf, 1.

‎[6] FADGI (2014) Creating and Archiving Born Digital Video Part III. High Level Recommended Practices, http://www.digitizationguidelines.gov/guidelines/FADGI_BDV_p3_20141202.pdf, 24.
[7] Jean-Christophe Kummer, Peter Kuhnle and Sebastian Gabler (2015) ‘Broadcast Archives: Between Productivity and Preservation’, IASA Journal, vol 44, 35.

[8] Kummer et al, ‘Broadcast Archives: Between Productivity and Preservation,’ 38.

[9] David Bull (2014) Communicating Pictures, Academic Press, 435-437.

[10] Av Preserve, A Primer on Codecs for Moving Image and Sound Archives, 2.

[11] For more reflections on compression, check out this fascinating talk from software theorist Alexander Galloway. The more practically bent can download and play with VISTRA, a video compression demonstrator developed at the University of Bristol ‘which provides an interactive overview of the some of the key principles of image and video compression.

[12] ‘FADGI, Digital File Formats for Videotape, 11.

[13] Presto Centre, AV Digitisation and Digital Preservation TechWatch Report #3, https://www.prestocentre.org/, 9.

[14] Presto Centre, AV Digitisation and Digital Preservation TechWatch Report #3, 10-11.

Posted by debra in digitisation expertise, video tape, 1 comment

Open Source Solutions for Digital Preservation

In a technological world that is rapidly changing how can digital information remain accessible?

One answer to this question lies in the use of open source technologies. As a digital preservation strategy it makes little sense to use codecs owned by Mac or Windows to save data in the long term. Propriety software essentially operate like closed systems and risk compromising access to data in years to come.

Linux Operating System

It is vital, therefore, that the digitisation work we do at Great Bear is done within the wider context of digital preservation. This means making informed decisions about the hardware and software we use to migrate your tape-based media into digital formats. We use a mixture of propriety and open source software, simply because it makes our a bit life easier. Customers also ask us to deliver their files in propriety formats. For example, Apple pro res is a really popular codec that doesn’t take up a lot of data space so our customers often request this, and of course we are happy to provide it.

Using open systems definitely has benefits. The flexibility of Linux, for example, enables us to customise our digitisation system according to what we need to do. As with the rest of our work, we are keen to find ways to keep using old technologies if they work well, rather than simply throwing things away when shiny new devices come on the market. There is the misconception that to ingest vast amounts of audio data you need the latest hardware. All you need in fact is a big hard drive, flexible, yet reliable, software and an operating system that doesn’t crash so it can be left to ingest for 8 hours or more. Simple! Examples of open source software we use is the sound processing programme SoX. This saves us a lot of time because we are able to write scripts for the programme that can be used to batch process audio data according to project specifications.

Openness in the digital preservation world

Within the wider digital preservation world open source technologies are also used widely. From digital preservation tools developed by projects such as SCAPE and the Open Planets Foundation, there are plenty of software resources available for individuals and organisations who need to manage their digital assets. It would be naïve, however, to assume that the practice of openness here, and in other realms of the information economy, are born from the same techno-utopian impulse that propelled the open software movement from the 1970s onwards. The SCAPE website makes it clear that the development of open source information preservation tools are ‘the best approach given the substantial public investment made at the European and national levels, and because it is the most effective way to encourage commercial growth.’

What would make projects like SCAPE and Open Planets even better is if they thought about ways to engage non-specialist users who may be curious about digital preservation tools but have little experience of navigating complex software. The tools may well be open, but the knowledge of how to use them are not.

Openness, as a means of widening access to technical skills and knowledge, is the impulse behind the AV Artifact Atlas (AVAA), an initiative developed in conjunction with the community media archive project Bay Area Video Coalition. In a recent interview on the Library of Congress’ Digital Preservation Blog, Hannah Frost, Digital Library Services Manager at Stanford Libraries and Manager, Stanford Media Preservation Lab explains the idea behind the AVAA.

‘The problem is most archivists, curators and conservators involved in media reformatting are ill-equipped to detect artifacts, or further still to understand their cause and ensure a high quality job. They typically don’t have deep training or practical experience working with legacy media. After all, why should we? This knowledge is by and large the expertise of video and audio engineers and is increasingly rare as the analogue generation ages, retires and passes on. Over the years, engineers sometimes have used different words or imprecise language to describe the same thing, making the technical terminology even more intimidating or inaccessible to the uninitiated. We need a way capture and codify this information into something broadly useful. Preserving archival audiovisual media is a major challenge facing libraries, archives and museums today and it will challenge us for some time. We need all the legs up we can get.’

The promise of openness can be a fraught terrain. In some respects we are caught between a hyper-networked reality, where ideas, information and tools are shared openly at a lightning pace. There is the expectation that we can have whatever we want, when we want it, which is usually now. On the other side of openness are questions of ownership and regulation – who controls information, and to what ends?

Perhaps the emphasis placed on the value of information within this context will ultimately benefit digital archives, because there will be significant investment, as there already has been, in the development of open resources that will help to take care of digital information in the long term.

Posted by debra in audio tape, digitisation expertise, video tape, 0 comments