Misplaced Pages

Digitization

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.
#960039

66-416: Digitization is the process of converting information into a digital (i.e. computer-readable) format. The result is the representation of an object, image , sound , document , or signal (usually an analog signal ) obtained by generating a series of numbers that describe a discrete set of points or samples . The result is called digital representation or, more specifically, a digital image , for

132-465: A signal , thus which keys are pressed. When the keyboard processor detects that a key has changed state, it sends a signal to the CPU indicating the scan code of the key and its new state. The symbol is then encoded or converted into a number based on the status of modifier keys and the desired character encoding . A custom encoding can be used for a specific application with no loss of data. However, using

198-441: A whiffletree mechanism to produce finer steps. The IBM Selectric typewriter uses such a system. DACs are widely used in modern communication systems enabling the generation of digitally-defined transmission signals. High-speed DACs are used for mobile communications and ultra-high-speed DACs are employed in optical communications systems. The most common types of electronic DACs are: The most important characteristics of

264-402: A DAC are: Other measurements, such as phase distortion and jitter , can also be very important for some applications, some of which (e.g. wireless data transmission, composite video) may even rely on accurate production of phase-adjusted signals. Non-linear PCM encodings (A-law / μ-law, ADPCM, NICAM) attempt to improve their effective dynamic ranges by using logarithmic step sizes between

330-617: A Microsoft Word document or a social media post. In contrast, digitization only applies exclusively to analog materials. Born-digital materials present a unique challenge to digital preservation not only due to technological obsolescence but also because of the inherently unstable nature of digital storage and maintenance. Most websites last between 2.5 and 5 years, depending on the purpose for which they were designed. The Library of Congress provides numerous resources and tips for individuals looking to practice digitization and digital preservation for their personal collections. Digital reformatting

396-629: A budget needs more money to cover the cost of equipment or staff, an institution might investigate if grants are available. Collaborations between institutions have the potential to save money on equipment, staff, and training as individual members share their equipment, manpower, and skills rather than pay outside organizations to provide these services. Collaborations with donors can build long-term support of current and future digitization projects. Outsourcing can be an option if an institution does not want to invest in equipment but since most vendors require an inventory and basic metadata for materials, this

462-528: A continually varying physical signal . Provided that a signal's bandwidth meets the requirements of the Nyquist–;Shannon sampling theorem (i.e., a baseband signal with bandwidth less than the Nyquist frequency ) and was sampled with infinite resolution, the original signal can theoretically be reconstructed from the sampled data. However, an ADC's filtering can't entirely eliminate all frequencies above

528-458: A group of switches that are polled at regular intervals to see which switches are switched. Data will be lost if, within a single polling interval, two switches are pressed, or a switch is pressed, released, and pressed again. This polling can be done by a specialized processor in the device to prevent burdening the main CPU . When a new symbol has been entered, the device typically sends an interrupt , in

594-677: A linear contrast ratio (difference between darkest and brightest output levels) of 1000:1 or greater, equivalent to 10 bits of audio precision even though it may only accept signals with 8-bit precision and use an LCD panel that only represents 6 or 7 bits per channel. Video signals from a digital source, such as a computer, must be converted to analog form if they are to be displayed on an analog monitor. As of 2007, analog inputs were more commonly used than digital, but this changed as flat-panel displays with DVI and/or HDMI connections became more widespread. A video DAC is, however, incorporated in any digital video player with analog outputs. The DAC

660-468: A packaged IC) would typically be extremely high-speed low-resolution power-hungry types, as used in military radar systems. Very high-speed test equipment, especially sampling oscilloscopes , may also use discrete DACs. A DAC converts an abstract finite-precision number (usually a fixed-point binary number ) into a physical quantity (e.g., a voltage or a pressure ). In particular, DACs are often used to convert finite-precision time series data to

726-435: A point where a book is completely unusable. In theory, if these widely circulated titles are not treated with de-acidification processes, the materials upon those acid pages will be lost. As digital technology evolves, it is increasingly preferred as a method of preserving these materials, mainly because it can provide easier access points and significantly reduce the need for physical storage space. Cambridge University Library

SECTION 10

#1732859156961

792-448: A specialized format, so that the CPU can read it. For devices with only a few switches (such as the buttons on a joystick ), the status of each can be encoded as bits (usually 0 for released and 1 for pressed) in a single word. This is useful when combinations of key presses are meaningful, and is sometimes used for passing the status of modifier keys on a keyboard (such as shift and control). But it does not scale to support more keys than

858-424: A standard encoding such as ASCII is problematic if a symbol such as 'ß' needs to be converted but is not in the standard. It is estimated that in the year 1986, less than 1% of the world's technological capacity to store information was digital and in 2007 it was already 94%. The year 2002 is assumed to be the year when humankind was able to store more information in digital than in analog format (the "beginning of

924-628: A video tape player to be connected to a computer while the item plays in real time. Slides can be digitized quicker with a slide scanner such as the Nikon Coolscan 5000ED. Another example of digitization is the VisualAudio process developed by the Swiss Fonoteca Nazionale in Lugano , by scanning a high resolution photograph of a record, they are able to extract and reconstruct the sound from

990-450: Is binary data , which is represented by a string of binary digits (bits) each of which can have one of two values, either 0 or 1. Digital data can be contrasted with analog data , which is represented by a value from a continuous range of real numbers . Analog data is transmitted by an analog signal , which not only takes on continuous values but can vary continuously with time, a continuous real-valued function of time. An example

1056-426: Is central to making digital representations of geographical features, using raster or vector images, in a geographic information system , i.e., the creation of electronic maps , either from various geographical and satellite imaging (raster) or by digitizing traditional paper maps or graphs (vector). "Digitization" is also used to describe the process of populating databases with files or data. While this usage

1122-632: Is done once with the technology currently available, while digital preservation is more complicated because technology changes so quickly that a once popular storage format may become obsolete before it breaks. An example is a 5 1/4" floppy drive, computers are no longer made with them and obtaining the hardware to convert a file stored on 5 1/4" floppy disc can be expensive. To combat this risk, equipment must be upgraded as newer technology becomes affordable (about 2 to 5 years), but before older technology becomes unobtainable (about 5 to 10 years). Digital preservation can also apply to born-digital material, such as

1188-500: Is fed into an embroidery machine and applied to the fabric. The most supported format is DST file. Apparel companies also digitize clothing patterns. Analog signals are continuous electrical signals; digital signals are non-continuous. Analog signals can be converted to digital signals by using an analog-to-digital converter . The process of converting analog to digital consists of two parts: sampling and quantizing. Sampling measures wave amplitudes at regular intervals, splits them along

1254-551: Is most commonly used in computing and electronics , especially where real-world information is converted to binary numeric form as in digital audio and digital photography . Since symbols (for example, alphanumeric characters ) are not continuous, representing symbols digitally is rather simpler than conversion of continuous or analog information to digital. Instead of sampling and quantization as in analog-to-digital conversion , such techniques as polling and encoding are used. A symbol input device usually consists of

1320-497: Is not an option for institutions hoping to digitize without processing. Digital data Digital data , in information theory and information systems , is information represented as a string of discrete symbols, each of which can take on one of only a finite number of values from some alphabet , such as letters or digits. An example is a text document , which consists of a string of alphanumeric characters . The most common form of digital data in modern information systems

1386-408: Is often described as converting it from analog to digital, however both copies remain. An example would be scanning a photograph and having the original piece in a photo album and a digital copy saved to a computer. This is essentially the first step in digital preservation which is to maintain the digital copy over a long period of time and making sure it remains authentic and accessible. Digitization

SECTION 20

#1732859156961

1452-648: Is still used by institutions such as the National Archives and Records Administration ( NARA ) to provide preservation and access to these resources. While digital versions of analog texts can potentially be accessed from anywhere in the world, they are not as stable as most print materials or manuscripts and are unlikely to be accessible decades from now without further preservation efforts, while many books manuscripts and scrolls have already been around for centuries. However, for some materials that have been damaged by water, insects, or catastrophes, digitization might be

1518-441: Is technically inaccurate, it originates with the previously proper use of the term to describe that part of the process involving digitization of analog sources, such as printed pictures and brochures, before uploading to target databases. Digitizing may also be used in the field of apparel, where an image may be recreated with the help of embroidery digitizing software tools and saved as embroidery machine code. This machine code

1584-464: Is the air pressure variation in a sound wave . The word digital comes from the same source as the words digit and digitus (the Latin word for finger ), as fingers are often used for counting. Mathematician George Stibitz of Bell Telephone Laboratories used the word digital in reference to the fast electric pulses emitted by a device designed to aim and fire anti-aircraft guns in 1942. The term

1650-405: Is the primary way of storing images in a form suitable for transmission and computer processing, whether scanned from two-dimensional analog originals or captured using an image sensor -equipped device such as a digital camera , tomographical instrument such as a CAT scanner , or acquiring precise dimensions from a real-world object, such as a car , using a 3D scanning device. Digitizing

1716-517: Is the process of converting analog materials into a digital format as a surrogate of the original. The digital surrogates perform a preservation function by reducing or eliminating the use of the original. Digital reformatting is guided by established best practices to ensure that materials are being converted at the highest quality. The Library of Congress has been actively reformatting materials for its American Memory project and developed best standards and practices pertaining to book handling during

1782-403: Is then reconstructed into analog using a DAC on the receiving party's end. Video sampling tends to work on a completely different scale altogether thanks to the highly nonlinear response both of cathode ray tubes (for which the vast majority of digital video foundation work was targeted) and the human eye, using a "gamma curve" to provide an appearance of evenly distributed brightness steps across

1848-511: Is to ensure that the digital files themselves are preserved and remain accessible; the term " digital preservation ," in its most basic sense, refers to an array of activities undertaken to maintain access to digital materials over time. The prevalent Brittle Books issue facing libraries across the world is being addressed with a digital solution for long term book preservation. Since the mid-1800s, books were printed on wood-pulp paper , which turns acidic as it decays. Deterioration may advance to

1914-475: Is unique and workflows for one will be different from every other project that goes through the process, so time must be spent thoroughly studying and planning each one to create the best plan for the materials and the intended audience. Cost of equipment, staff time, metadata creation, and digital storage media make large scale digitization of collections expensive for all types of cultural institutions . Ideally, all institutions want their digital copies to have

1980-453: Is used to describe, for example, the scanning of analog sources (such as printed photos or taped videos ) into computers for editing, 3D scanning that creates 3D modeling of an object's surface, and audio (where sampling rate is often measured in kilohertz ) and texture map transformations. In this last case, as in normal photos, the sampling rate refers to the resolution of the image, often measured in pixels per inch. Digitizing

2046-510: Is usually integrated with some memory ( RAM ), which contains conversion tables for gamma correction , contrast and brightness, to make a device called a RAMDAC . A device that is distantly related to the DAC is the digitally controlled potentiometer , used to control an analog signal digitally. A one-bit mechanical actuator assumes two positions: one when on, another when off. The motion of several one-bit actuators can be combined and weighted with

Digitization - Misplaced Pages Continue

2112-647: Is working on the Cambridge Digital Library , which will initially contain digitised versions of many of its most important works relating to science and religion. These include examples such as Isaac Newton's personally annotated first edition of his Philosophiæ Naturalis Principia Mathematica as well as college notebooks and other papers, and some Islamic manuscripts such as a Quran from Tipu Sahib's library. Google, Inc. has taken steps towards attempting to digitize every title with " Google Book Search ". While some academic libraries have been contracted by

2178-446: The digital age "). Digital data come in these three states: data at rest , data in transit , and data in use . The confidentiality, integrity, and availability have to be managed during the entire lifecycle from 'birth' to the destruction of the data. All digital information possesses common properties that distinguish it from analog data with respect to communications: Even though digital signals are generally associated with

2244-408: The digital revolution . To illustrate, consider a typical long-distance telephone call. The caller's voice is converted into an analog electrical signal by a microphone , then the analog signal is converted to a digital stream by an ADC. The digital stream is then divided into network packets where it may be sent along with other digital data , not necessarily audio. The packets are then received at

2310-546: The Nyquist frequency, which will alias into the baseband frequency range. And the ADC's digital sampling process introduces some quantization error (rounding error), which manifests as low-level noise. These errors can be kept within the requirements of the targeted application (e.g. under the limited dynamic range of human hearing for audio applications). DACs and ADCs are part of an enabling technology that has contributed greatly to

2376-489: The best image quality so a high-quality copy can be maintained over time. In the mid-long term, digital storage would be regarded as the more expensive part to maintain the digital archives due to the increasing number of scanning requests. However, smaller institutions may not be able to afford such equipment or manpower, which limits how much material can be digitized, so archivists and librarians must know what their patrons need and prioritize digitization of those items. To help

2442-461: The binary electronic digital systems used in modern electronics and computing, digital systems are actually ancient, and need not be binary or electronic. Digital-to-analog conversion In electronics , a digital-to-analog converter ( DAC , D/A , D2A , or D-to-A ) is a system that converts a digital signal into an analog signal . An analog-to-digital converter (ADC) performs the reverse function. There are several DAC architectures ;

2508-803: The destination, but each packet may take a completely different route and may not even arrive at the destination in the correct time order. The digital voice data is then extracted from the packets and assembled into a digital data stream. A DAC converts this back into an analog electrical signal, which drives an audio amplifier , which in turn drives a speaker , which finally produces sound. Most modern audio signals are stored in digital form (for example MP3s and CDs ), and in order to be heard through speakers, they must be converted into an analog signal. DACs are therefore found in CD players , digital music players , and PC sound cards . Specialist standalone DACs can also be found in high-end hi-fi systems. These normally take

2574-492: The digital output of a compatible CD player or dedicated transport (which is basically a CD player with no internal DAC) and convert the signal into an analog line-level output that can then be fed into an amplifier to drive speakers. Similar digital-to-analog converters can be found in digital speakers such as USB speakers, and in sound cards . In voice over IP applications, the source must first be digitized for transmission, so it undergoes conversion via an ADC and

2640-446: The digital preservation field. Sometimes digitization and digital preservation are mistaken for the same thing. They are different, but digitization is often a vital first step in digital preservation. Libraries, archives, museums, and other memory institutions digitize items to preserve fragile materials and create more access points for patrons. Doing this creates challenges for information professionals and solutions can be as varied as

2706-455: The digitization process, scanning resolutions, and preferred file formats. Some of these standards are: A list of archival standards for digital preservation can be found on the ARL website. The Library of Congress has constituted a Preservation Digital Reformatting Program. The Three main components of the program include: Audio media offers a rich source of historic ethnographic information, with

Digitization - Misplaced Pages Continue

2772-442: The display's full dynamic range - hence the need to use RAMDACs in computer video applications with deep enough color resolution to make engineering a hardcoded value into the DAC for each output level of each channel impractical (e.g. an Atari ST or Sega Genesis would require 24 such values; a 24-bit video card would need 768...). Given this inherent distortion, it is not unusual for a television or video projector to truthfully claim

2838-841: The earliest forms of recorded sound dating back to 1890. According to the International Association of Sound and Audiovisual Archives (IASA), these sources of audio data, as well as the aging technologies used to play them back, are in imminent danger of permanent loss due to degradation and obsolescence. These primary sources are called “carriers” and exist in a variety of formats, including wax cylinders, magnetic tape, and flat discs of grooved media, among others. Some formats are susceptible to more severe, or quicker, degradation than others. For instance, lacquer discs suffer from delamination . Analog tape may deteriorate due to sticky shed syndrome . Archival workflow and file standardization have been developed to minimize loss of information from

2904-487: The expectation that everything should already be online. The time spent planning, doing the work, and processing the digital files along with the expense and fragility of some materials are some of the most common. Digitization is a time-consuming process, even more so when the condition or format of the analog resources requires special handling. Deciding what part of a collection to digitize can sometimes take longer than digitizing it in its entirety. Each digitization project

2970-401: The field can attend conferences and join organizations and working groups to keep their knowledge current and add to the conversation. The term digitization is often used when diverse forms of information, such as an object, text, sound, image, or voice, are converted into a single binary code . The core of the process is the compromise between the capturing device and the player device so that

3036-574: The frequency/resolution trade-off. The audio DAC is a low-frequency, high-resolution type while the video DAC is a high-frequency low- to medium-resolution type. Due to the complexity and the need for precisely matched components , all but the most specialized DACs are implemented as integrated circuits (ICs). These typically take the form of metal–oxide–semiconductor (MOS) mixed-signal integrated circuit chips that integrate both analog and digital circuits . Discrete DACs (circuits constructed from multiple discrete electronic components instead of

3102-481: The information institutions to better decide the archives worth of digitization, Casablancas and other researchers used a proposed model to investigate the impact of different digitization strategies on the decrease in access requests in the archival and library reading rooms. Often the cost of time and expertise involved with describing materials and adding metadata is more than the digitization process. Some materials, such as brittle books, are so fragile that undergoing

3168-575: The institutions that implement them. Some analog materials, such as audio and video tapes, are nearing the end of their life cycle, and it is important to digitize them before equipment obsolescence and media deterioration makes the data irretrievable. There are challenges and implications surrounding digitization including time, cost, cultural history concerns, and creating an equitable platform for historically marginalized voices. Many digitizing institutions develop their own solutions to these challenges. Mass digitization projects have had mixed results over

3234-478: The items for digital collections. It can be time consuming to make sure all potential copyright holders have given permission, but if copyright cannot be determined or cleared, it may be necessary to restrict even digital materials to in library use. Institutions can make digitization more cost-effective by planning before a project begins, including outlining what they hope to accomplish and the minimum amount of equipment, time, and effort that can meet those goals. If

3300-431: The number of bits in a single byte or word. Devices with many switches (such as a computer keyboard ) usually arrange these switches in a scan matrix, with the individual switches on the intersections of x and y lines. When a switch is pressed, it connects the corresponding x and y lines together. Polling (often called scanning in this case) is done by activating each x line in sequence and detecting which y lines then have

3366-432: The number of possible values of the signal at a given time , as well as in the number of points in the signal in a given period of time. However, digital signals are discrete in both of those respects – generally a finite sequence of integers – therefore a digitization can, in practical terms, only ever be an approximation of the signal it represents. Digitization occurs in two parts: In general, these can occur at

SECTION 50

#1732859156961

3432-521: The object, and digital form , for the signal. In modern practice, the digitized data is in the form of binary numbers , which facilitates processing by digital computers and other operations, but digitizing simply means "the conversion of analog source material into a numerical format"; the decimal or any other number system can be used instead. Digitization is of crucial importance to data processing, storage, and transmission, because it "allows information of all kinds in all formats to be carried with

3498-401: The only option for continued use. In the context of libraries, archives, and museums, digitization is a means of creating digital surrogates of analog materials, such as books, newspapers, microfilm and videotapes, offers a variety of benefits, including increasing access, especially for patrons at a distance; contributing to collection development, through collaborative initiatives; enhancing

3564-452: The original carrier to the resulting digital file as digitization is underway. For most at-risk formats (magnetic tape, grooved cylinders, etc.), a similar workflow can be observed. Examination of the source carrier will help determine what, if any, steps need to be taken to repair material prior to transfer. A similar inspection must be undertaken for the playback machines. If satisfactory conditions are met for both carrier and playback machine,

3630-432: The potential for research and education; and supporting preservation activities. Digitization can provide a means of preserving the content of the materials by creating an accessible facsimile of the object in order to put less strain on already fragile originals. For sounds, digitization of legacy analog recordings is essential insurance against technological obsolescence. A fundamental aspect of planning digitization projects

3696-506: The process of digitization could damage them irreparably. Despite potential damage, one reason for digitizing fragile materials is because they are so heavily used that creating a digital surrogate will help preserve the original copy long past its expected lifetime and increase access to the item. Copyright is not only a problem faced by projects like Google Books , but by institutions that may need to contact private citizens or institutions mentioned in archival documents for permission to scan

3762-648: The processed image. Digitization of analog tapes before they degrade, or after damage has already occurred, can rescue the only copies of local and traditional cultural music for future generations to study and enjoy. Academic and public libraries, foundations, and private companies like Google are scanning older print books and applying optical character recognition (OCR) technologies so they can be keyword searched, but as of 2006, only about 1 in 20 texts had been digitized. Librarians and archivists are working to increase this statistic and in 2019 began digitizing 480,000 books published between 1923 and 1964 that had entered

3828-414: The public domain. Unpublished manuscripts and other rare papers and documents housed in special collections are being digitized by libraries and archives , but backlogs often slow this process and keep materials with enduring historical and research value hidden from most users (see digital libraries ). Digitization has not completely replaced other archival imaging options, such as microfilming which

3894-510: The rendered result represents the original source with the most possible fidelity, and the advantage of digitization is the speed and accuracy in which this form of information can be transmitted with no degradation compared with analog information. Digital information exists as one of two digits, either 0 or 1. These are known as bits (a contraction of binary digits ) and the sequences of 0s and 1s that constitute information are called bytes . Analog signals are continuously variable, both in

3960-402: The same efficiency and also intermingled." Though analog data is typically more stable, digital data has the potential to be more easily shared and accessed and, in theory, can be propagated indefinitely without generation loss, provided it is migrated to new, stable formats as needed . This potential has led to institutional digitization projects designed to improve access and the rapid growth of

4026-420: The same time, though they are conceptually distinct. A series of digital integers can be transformed into an analog output that approximates the original analog signal. Such a transformation is called a digital-to-analog conversion . The sampling rate and the number of bits used to represent the integers combine to determine how close such an approximation to the analog signal a digitization will be. The term

SECTION 60

#1732859156961

4092-460: The service, issues of copyright law violations threaten to derail the project. However, it does provide – at the very least – an online consortium for libraries to exchange information and for researchers to search for titles as well as review the materials. Digitizing something is not the same as digitally preserving it. To digitize something is to create a digital surrogate (copy or format) of an existing analog item (book, photograph, or record) and

4158-568: The suitability of a DAC for a particular application is determined by figures of merit including: resolution , maximum sampling frequency and others. Digital-to-analog conversion can degrade a signal, so a DAC should be specified that has insignificant errors in terms of the application. DACs are commonly used in music players to convert digital data streams into analog audio signals . They are also used in televisions and mobile phones to convert digital video data into analog video signals . These two applications use DACs at opposite ends of

4224-545: The transfer can take place, moderated by an analog-to-digital converter . The digital signal is then represented visually for the transfer engineer by a digital audio workstation , like Audacity, WaveLab, or Pro Tools. Reference access copies can be made at smaller sample rates. For archival purposes, it is standard to transfer at a sample rate of 96 kHz and a bit depth of 24 bits per channel. Many libraries, archives, museums, and other memory institutions, struggle with catching up and staying current regarding digitization and

4290-618: The vertical axis, and assigns them a numerical value, while quantizing looks for measurements that are between binary values and rounds them up or down. Nearly all recorded music has been digitized, and about 12 percent of the 500,000+ movies listed on the Internet Movie Database are digitized and were released on DVD . Digitization of home movies , slides , and photographs is a popular method of preserving and sharing personal multimedia. Slides and photographs may be scanned quickly using an image scanner , but analog video requires

4356-568: The years, but some institutions have had success even if not in the traditional Google Books model. Although e-books have undermined the sales of their printed counterparts, a study from 2017 indicated that the two cater to different audiences and use-cases. In a study of over 1400 university students it was found that physical literature is more apt for intense studies while e-books provide a superior experience for leisurely reading. Technological changes can happen often and quickly, so digitization standards are difficult to keep updated. Professionals in

#960039