55-763: Symonston ( postcode : 2609) is a primarily industrial and agricultural suburb of Canberra , Australian Capital Territory , Australia . Symonston is named after Sir Josiah Symon a Legislator, Federalist and one of the Founders of the Constitution of Australia. Located in Symonston are the Periodic Detention Centre and Symonston Temporary Remand Centre and three caravan parks: Canberra South Motor Park, Sundown Village and Narrabundah Longstay Caravan Park. Geoscience Australia has its headquarters in Symonston, as does
110-694: A Large Volume Receiver (LVR), e.g. the Royal Brisbane and Women's Hospital has the postcode 4029, the Australian National University had the postcode 0200. More postcode ranges were made available for LVRs in the 1990s. Australia Post has been progressively discontinuing the LVR programme since 2006. Some places in Australia do not have postcodes. These are typically remote areas with little or no population. The first one or two numerals usually show
165-461: A dropout color which can be easily removed by the OCR system. Palm OS used a special set of glyphs, known as Graffiti , which are similar to printed English characters but simplified or modified for easier recognition on the platform's computationally limited hardware. Users would need to learn how to write these special glyphs. Zone-based OCR restricts the image to a specific part of a document. This
220-452: A "Statistical Machine" for searching microfilm archives using an optical code recognition system. In 1931, he was granted US Patent number 1,838,389 for the invention. The patent was acquired by IBM . In 1974, Ray Kurzweil started the company Kurzweil Computer Products, Inc. and continued development of omni- font OCR, which could recognize text printed in virtually any font. (Kurzweil is often credited with inventing omni-font OCR, but it
275-439: A character error rate of 1% (99% accuracy) may result in an error rate of 5% or worse if the measurement is based on whether each whole word was recognized with no incorrect letters. Using a large enough dataset is important in a neural-network-based handwriting recognition solutions. On the other hand, producing natural datasets is very complicated and time-consuming. An example of the difficulties inherent in digitizing old text
330-464: A list of optical character recognition software, see Comparison of optical character recognition software . OCR accuracy can be increased if the output is constrained by a lexicon – a list of words that are allowed to occur in a document. This might be, for example, all the words in the English language, or a more technical lexicon for a specific field. This technique can be problematic if
385-554: A noun, for example, allowing greater accuracy. The Levenshtein Distance algorithm has also been used in OCR post-processing to further optimize results from an OCR API. In recent years, the major OCR technology providers began to tweak OCR systems to deal more efficiently with specific types of input. Beyond an application-specific lexicon, better performance may be had by taking into account business rules, standard expression, or rich information contained in color images. This strategy
440-579: A number of localities across WA, SA and NT. Three locations (Mingoola, Mungindi and Texas ) straddle the NSW-Queensland border (so same town name and postcode on both sides). Jervis Bay Territory is located on the coast of NSW. It is just south of the towns of Vincentia and Huskisson, with which it shares a postcode. Mail to the Jervis Bay Territory is addressed to the ACT. The numerals used to show
495-399: A ranked list of candidate characters. Software such as Cuneiform and Tesseract use a two-pass approach to character recognition. The second pass is known as adaptive recognition and uses the letter shapes recognized with high confidence on the first pass to better recognize the remaining letters on the second pass. This is advantageous for unusual fonts or low-quality scans where the font
550-411: A single character) – are still the subject of active research. The MNIST database is commonly used for testing systems' ability to recognize handwritten digits. Accuracy rates can be measured in several ways, and how they are measured can greatly affect the reported accuracy rate. For example, if word context (a lexicon of words) is not used to correct software finding non-existent words,
605-588: A small out crop east of Jerrabomberra creek. Geoscience Australia is the Australian Government organisation tasked with supplying scientific information and knowledge about the geography and geology of Australia . It has information about the geology of the whole of Australia including Canberra. Symonston residents get preference for: Postcodes in Australia Postcodes in Australia are used to more efficiently sort and route mail within
SECTION 10
#1733086180864660-625: A sorting number which is printed on the letter as an orange-coloured barcode . This number might be 12 digits long. This system enables the post office to place items in delivery order. Companies can also use Rapid Addressing Tool (RATS) to print Customer Addressed Barcodes, which are printed above the address. Every Delivery Point in Australia (DPID) has its own number. Barcode sorter machines sort and sequence letters up to C5 size. Flat multi-level OCR reads, imprints and sorts C5 to A3 sized articles. Optical character recognition Optical character recognition or optical character reader ( OCR )
715-451: A time. Advanced systems capable of producing a high degree of accuracy for most fonts are now common, and with support for a variety of image file format inputs. Some systems are capable of reproducing formatted output that closely approximates the original page including images, columns, and other non-textual components. Early optical character recognition may be traced to technologies involving telegraphy and creating reading devices for
770-479: Is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as cognitive computing , machine translation , (extracted) text-to-speech , key data and text mining . OCR is a field of research in pattern recognition , artificial intelligence and computer vision . Early versions needed to be trained with images of each character, and worked on one font at
825-428: Is accomplished relatively simply by aligning the image to a uniform grid based on where vertical grid lines will least often intersect black areas. For proportional fonts , more sophisticated techniques are needed because whitespace between letters can sometimes be greater than that between words, and vertical lines can intersect more than one character. There are two basic types of core OCR algorithm, which may produce
880-521: Is an inherent problem with spatial representation of postcodes since areas referenced to a specific postal code tend to change over time, as new postcodes are added or existing ones are split by Australia Post for operational purposes. Some mail generating software can automatically sort outward mail by postcode before it is generated. This pre-sort of mail into postcode order eliminates the need for it to be sorted manually or using mail sorting machines . To improve mechanised sorting, each address now has
935-548: Is called "Application-Oriented OCR" or "Customized OCR", and has been applied to OCR of license plates , invoices , screenshots , ID cards , driver's licenses , and automobile manufacturing . The New York Times has adapted the OCR technology into a proprietary tool they entitle Document Helper , that enables their interactive news team to accelerate the processing of documents that need to be reviewed. They note that it enables them to process what amounts to as many as 5,400 pages per hour in preparation for reporters to review
990-413: Is distorted (e.g. blurred or faded). As of December 2016 , modern OCR software includes Google Docs OCR, ABBYY FineReader , and Transym. Others like OCRopus and Tesseract use neural networks which are trained to recognize whole lines of text instead of focusing on single characters. A technique known as iterative OCR automatically crops a document into sections based on the page layout. OCR
1045-411: Is generally an offline process, which analyses a static document. There are cloud based services which provide an online OCR API service. Handwriting movement analysis can be used as input to handwriting recognition . Instead of merely using the shapes of glyphs and words, this technique is able to capture motion, such as the order in which segments are drawn, the direction, and the pattern of putting
1100-664: Is intended for a specific identity within an organisation, their identity can be prepended on the line above the business name, with c/- prepended to the business name. If addressing a letter from outside Australia, the postcode is recorded before 'Australia', which is placed on a fourth line. Australian postcodes are sorting information. They are often linked with one area (e.g. 6160 belongs only to Fremantle, Western Australia ). Due to postcode rationalisation, they can be quite complex, especially in country areas (e.g. 2570 belongs to twenty-two towns and suburbs around Camden, New South Wales ). The south-western Victoria 3221 postcode of
1155-476: Is often referred to as Template OCR . Crowdsourcing humans to perform the character recognition can quickly process images like computer-driven OCR, but with higher accuracy for recognizing images than that obtained via computers. Practical systems include the Amazon Mechanical Turk and reCAPTCHA . The National Library of Finland has developed an online interface for users to correct OCRed texts in
SECTION 20
#17330861808641210-608: Is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example: from a television broadcast). Widely used as a form of data entry from printed paper data records – whether passport documents, invoices, bank statements , computerized receipts, business cards, mail, printed data, or any suitable documentation – it
1265-434: Is the inability of OCR to differentiate between the " long s " and "f" characters. Web-based OCR systems for recognizing hand-printed text on the fly have become well known as commercial products in recent years (see Tablet PC history ). Accuracy rates of 80% to 90% on neat, clean hand-printed characters can be achieved by pen computing software, but that accuracy rate still translates to dozens of errors per page, making
1320-552: Is then performed on each section individually using variable character confidence level thresholds to maximize page-level OCR accuracy. A patent from the United States Patent Office has been issued for this method. The OCR result can be stored in the standardized ALTO format, a dedicated XML schema maintained by the United States Library of Congress . Other common formats include hOCR and PAGE XML. For
1375-450: The Amount line of a check (which is always a written-out number) is an example where using a smaller dictionary can increase recognition rates greatly. The shapes of individual cursive characters themselves simply do not contain enough information to accurately (greater than 98%) recognize all handwritten cursive script. Most programs allow users to set "confidence rates". This means that if
1430-629: The Australian National Antarctic Research Expeditions share a postcode with the isolated sub-Antarctic island of Macquarie Island (part of Tasmania): There is also a "special" postcode for routing mail sent to Santa: Each state's capital city ends with three zeroes, while territorial capital cities end with two zeroes. Capital city postcodes were the lowest postcodes in their state or territory range, before new ranges for LVRs and PO Boxes were made available. The last numeral can usually be changed from "0" to "1" to get
1485-581: The Australian National University (now 2601) to 9944 for Cannonvale , Queensland. Some towns and suburbs have two postcodes — one for street deliveries and another for post office boxes . For example, a street address in the Sydney suburb of Parramatta would be written like this: But mail sent to a PO Box in Parramatta would be addressed: Many large businesses, government departments and other institutions receiving high volumes of mail had their own postcode as
1540-532: The Geelong Mail Centre also includes twenty places around Geelong with very few people. This means that mail for these places is not fully sorted until it gets to Geelong. Some postcodes cover large populations (e.g., postcode 4350 serving some 100,000 people in Toowoomba and the surrounding area), while other postcodes have much smaller populations, even in urban areas. Australian postcodes range from 0200 for
1595-511: The Silurian age Mount Painter Volcanics dark grey to green grey dacitic tuff underlies most of Symonston. Narrabundah Ashstone Member is found in the northern corner near the motor park. Canberra Formation, calcareous shale found to the east of the ashstone. Further to the east and over to Harman is found the dacitic andesite of the Ainslie Volcanics. An unnamed coarse leucogranite has
1650-798: The Therapeutic Goods Administration . The Symonston area has traditionally been denoted 'Broadacre' area by the planning authorities, meaning that it has retained a traditionally rural character with some larger institution uses, particularly by the Australian Defence Force and Geoscience Australia . With the release of the Canberra Spatial Plan by the ACT Government, the area and the adjoining Majura Valley has been denoted as an employment corridor centred on Canberra Airport and Fyshwick . Rocks in Symonston are from
1705-469: The 2000s, OCR was made available online as a service (WebOCR), in a cloud computing environment, and in mobile applications like real-time translation of foreign-language signs on a smartphone . With the advent of smartphones and smartglasses , OCR can be used in internet connected mobile device applications that extract text captured using the device's camera. These devices that do not have built-in OCR functionality will typically use an OCR API to extract
Symonston, Australian Capital Territory - Misplaced Pages Continue
1760-676: The Australian postal system. Postcodes in Australia have four digits and are placed at the end of the Australian address, before the country. Postcodes were introduced in Australia in 1967 by the Postmaster-General's Department and are now managed by Australia Post , Australia's national postal service. Postcodes are published in booklets available from post offices or online from the Australia Post website. Australian envelopes and postcards often have four square boxes printed in orange at
1815-458: The blind. In 1914, Emanuel Goldberg developed a machine that read characters and converted them into standard telegraph code. Concurrently, Edmund Fournier d'Albe developed the Optophone , a handheld scanner that when moved across a printed page, produced tones that corresponded to specific letters or characters. In the late 1920s and into the 1930s, Emanuel Goldberg developed what he called
1870-447: The bottom right for the postcode. These are used to assist with the automated sorting of mail that has been addressed by hand for Australian delivery. Postcodes were introduced in Australia in 1967 by the Postmaster-General's Department (PMG) to replace earlier postal sorting systems, such as Melbourne's letter and number codes (e.g., N3 , E5 ) and a similar system then used in rural and regional New South Wales . The introduction of
1925-828: The contents. There are several techniques for solving the problem of character recognition by means other than improved OCR algorithms. Special fonts like OCR-A , OCR-B , or MICR fonts, with precisely specified sizing, spacing, and distinctive character shapes, allow a higher accuracy rate during transcription in bank check processing. Several prominent OCR engines were designed to capture text in popular fonts such as Arial or Times New Roman, and are incapable of capturing text in these fonts that are specialized and very different from popularly used fonts. As Google Tesseract can be trained to recognize new fonts, it can recognize OCR-A, OCR-B and MICR fonts. Comb fields are pre-printed boxes that encourage humans to write more legibly – one glyph per box. These are often printed in
1980-481: The cost of car and house insurance. Transport for NSW uses postcodes to give specific numbers for each bus stop in Greater Sydney . The stop number is five to seven numbers: the first four are the postcode, and the others show the bus stop (sometimes written with a space in between, e.g. "2000 108"). Many companies that produce metropolitan street maps also list the postcodes for each suburb, both in indexes and on
2035-406: The document contains words not in the lexicon, like proper nouns . Tesseract uses its dictionary to influence the character segmentation step, for improved accuracy. The output stream may be a plain text stream or file of characters, but more sophisticated OCR systems can preserve the original layout of the page and produce, for example, an annotated PDF that includes both the original image of
2090-581: The leaders of the National Federation of the Blind . In 1978, Kurzweil Computer Products began selling a commercial version of the optical character recognition computer program. LexisNexis was one of the first customers, and bought the program to upload legal paper and news documents onto its nascent online databases. Two years later, Kurzweil sold his company to Xerox , which eventually spun it off as Scansoft , which merged with Nuance Communications . In
2145-404: The maps themselves. Spatial representation of postcodes is also very popular in defining sales and franchise territories. Australian Bureau of Statistics, Australian Taxation Office and many other Federal and State government organisations publish a variety of statistics by postcode which extends the use Australia Post four digit code to many business and social planning related activities. There
2200-716: The most authoritative of the Annual Test of OCR Accuracy from 1992 to 1996. Recognition of typewritten, Latin script text is still not 100% accurate even where clear imaging is available. One study based on recognition of 19th- and early 20th-century newspaper pages concluded that character-by-character OCR accuracy for commercial OCR software varied from 81% to 99%; total accuracy can be achieved by human review or Data Dictionary Authentication. Other areas – including recognition of hand printing, cursive handwriting, and printed text in other scripts (especially those East Asian language characters which have many strokes for
2255-400: The name of the city, suburb, or town, and the state or territory: Recipient Name 100 Citizen Road BLACKTOWN NSW 2148 When writing an address by hand, and a row of four boxes is pre-printed on the lower right-hand corner of an envelope, the postcode may be written in the boxes instead. When posting to an organisation, business names can be written instead of a recipient name. If an article
Symonston, Australian Capital Territory - Misplaced Pages Continue
2310-706: The numbering was based on the (now retired) Australian Military Administrative Districts, however this is incorrect. For example, in that numbering scheme Queensland is 1, and 4 is South Australia. By 1968, 75% of mail was using postcodes, and in the same year post office preferred-size envelopes were introduced, which came to be referred to as "standard envelopes". Postcode squares were introduced in June 1990 to enable Australia Post to use optical character recognition (OCR) software in its mail sorting machines to automatically and more quickly sort mail by postcodes. Australian postcodes consist of four digits, and are written after
2365-406: The page and a searchable textual representation. Near-neighbor analysis can make use of co-occurrence frequencies to correct errors, by noting that certain words are often seen together. For example, "Washington, D.C." is generally far more common in English than "Washington DOC". Knowledge of the grammar of the language being scanned can also help determine if a word is likely to be a verb or
2420-416: The pen down and lifting it. This additional information can make the process more accurate. This technology is also known as "online character recognition", "dynamic character recognition", "real-time character recognition", and "intelligent character recognition". OCR software often pre-processes images to improve the chances of successful recognition. Techniques include: Segmentation of fixed-pitch fonts
2475-580: The postcode 2750 and Petrie, Queensland has the postcode 4502. Within each region with the same second numeral, postcodes are usually allocated in ascending order the further one travels from the state's capital city along major highways and railways. For instance, heading north on the North Coast railway in New South Wales away from Sydney: Major towns and cities tend to have "0" as the last numeral or last two numerals, e.g. Rockhampton, Queensland has
2530-915: The postcode 4700 and Ballarat, Victoria has the postcode 3350. There are exceptions; the major town of Ipswich, Queensland has the postcode 4305, while Goodna , a relatively small suburb of Ipswich, is allocated 4300. Australia Post prefers envelopes sold in Australia to have four boxes in the bottom right corner of the front of the envelope. Entering the postcode in these boxes or squares, which Australia Post calls postcode squares , enables Australia Post to use optical character recognition software in its mail sorting machines to automatically and more quickly sort mail into postcodes, which also embeds routing information. Postcode squares were introduced in June 1990. Australia Post says that postcode squares should not be used for machine-addressed mail, or mail going overseas. Many other organisations now use postcodes. Insurance companies often use postcodes when working out
2585-425: The postcode for General Post Office boxes in any capital city (though Perth now uses a different range of postcodes for its GPO Boxes): While the first numeral of a postcode usually shows the state or territory, the second numeral usually shows a region within the state. However, postcodes with the same second numeral are not always next to each other. As an example, postcodes in the range 2200–2299 are split between
2640-514: The postcodes coincided with the introduction of a large-scale mechanical mail sorting system in Australia, starting with the Sydney GPO . The first digit for each state's postcode was copied directly from Australian radio call signs . Over time, digits beyond this set have come to be used for PO Boxes etc., for example the 8000 series refers to special addresses in Victoria. It is sometimes stated that
2695-593: The southern suburbs of Sydney and the Central Coast , Lake Macquarie and Newcastle regions of New South Wales. Postcodes with a second numeral of "0" or "1" are almost always located within the metropolitan area of the state's capital city. Postcodes with higher second numerals are usually located in rural and regional areas. Common exceptions are where towns were rural when postcodes were first introduced in 1967, but have since been suburbanised and incorporated into metropolitan areas, e.g. Penrith, New South Wales has
2750-561: The standardized ALTO format. Crowd sourcing has also been used not to perform character recognition directly but to invite software developers to develop image processing algorithms, for example, through the use of rank-order tournaments . Commissioned by the U.S. Department of Energy (DOE), the Information Science Research Institute (ISRI) had the mission to foster the improvement of automated technologies for understanding machine printed documents, and it conducted
2805-538: The state on each radio callsign in Australia are the same numeral as the first numeral for postcodes in that state, e.g. 2xx in New South Wales, 3xx in Victoria, etc. Radio callsigns pre-date postcodes in Australia by more than forty years. Australia's external territories are also included in Australia Post's postcode system. While these territories do not belong to any state, they are addressed as such for mail sorting: Three scientific bases in Antarctica operated by
SECTION 50
#17330861808642860-445: The state or territory that the postcode belongs to Sometimes near the state and territory borders, Australia Post finds it easier to send mail through a nearby post office that is across the border: Some of the postcodes above may cover two or more states. For example, postcode 2620 covers both a locality in NSW ( Gundaroo ) as well as a locality in the ACT ( Hume ), and postcode 0872 covers
2915-458: The technology useful only in very limited applications. Recognition of cursive text is an active area of research, with recognition rates even lower than that of hand-printed text . Higher rates of recognition of general cursive script will likely not be possible without the use of contextual or grammatical information. For example, recognizing entire words from a dictionary is easier than trying to parse individual characters from script. Reading
2970-688: The text from the image file captured by the device. The OCR API returns the extracted text, along with information about the location of the detected text in the original image back to the device app for further processing (such as text-to-speech) or display. Various commercial and open source OCR systems are available for most common writing systems , including Latin, Cyrillic, Arabic, Hebrew, Indic, Bengali (Bangla), Devanagari, Tamil, Chinese, Japanese, and Korean characters. OCR engines have been developed into software applications specializing in various subjects such as receipts, invoices, checks, and legal billing documents. The software can be used for: OCR
3025-402: Was in use by companies, including CompuScan, in the late 1960s and 1970s. ) Kurzweil used the technology to create a reading machine for blind people to have a computer read text to them out loud. The device included a CCD -type flatbed scanner and a text-to-speech synthesizer. On January 13, 1976, the finished product was unveiled during a widely reported news conference headed by Kurzweil and
#863136