SAP HANA - Misplaced Pages

An in-memory database (IMDB, also called a main memory database system (MMDB) or memory-resident database) is a database management system that primarily relies on main memory for computer data storage. It is contrasted with database management systems that employ a disk storage mechanism. In-memory databases are faster than disk-optimized databases because disk access is slower than memory access and the internal optimization algorithms are simpler and execute fewer CPU instructions. Accessing data in memory eliminates seek time when querying the data, which provides faster and more predictable performance than disk.

40-604: SAP HANA (HochleistungsANalyseAnwendung or High-performance ANalytic Application) is an in-memory, column-oriented, relational database management system developed and marketed by SAP SE. Its primary function as the software running a database server is to store and retrieve data as requested by the applications. In addition, it performs advanced analytics (predictive analytics, spatial data processing, text analytics, text search, streaming analytics, graph data processing) and includes extract, transform, load (ETL) capabilities as well as an application server. During

80-630: A platform as a service offering called the SAP HANA Cloud Platform and a variant called SAP HANA One that used a smaller amount of memory. In May 2013, a managed private cloud offering called the HANA Enterprise Cloud service was announced. In May 2013, Business Suite on HANA became available, enabling customers to run SAP Enterprise Resource Planning functions on the HANA platform. S/4HANA , released in 2015, written specifically for

120-723: A SAP-certified HANA database should they choose the features offered by S/4HANA. Rather than versioning , the software utilizes service packs , referred to as Support Package Stacks (SPS), for updates. Support Package Stacks are released every 6 months. In November 2016 SAP announced SAP HANA 2, which offers enhancements to multiple areas such as database management and application management and includes two new cloud services: Text Analysis and Earth Observation Analysis. HANA customers can upgrade to HANA 2 from SPS10 and above. Customers running SPS9 and below must first upgrade to SPS12 before upgrading to HANA 2 SPS01. The key distinctions between HANA and previous generation SAP systems are that it

160-413: A Service basis, including: SAP also offers its own cloud services in the form of: SAP HANA licensing is primarily divided into two categories. Runtime License: Used to run SAP applications such as SAP Business Warehouse powered by SAP HANA and SAP S/4HANA. Full Use License: Used to run both SAP and non-SAP applications. This licensing can be used to create custom applications. As part of

200-520: A bitmap). Another example is the use of run-length encoding to encode a column. Column-oriented benefits from smaller compressed size. This is the result of a higher homogeneity within a column than within multiple rows. Because both orientations represent the same data, it is possible to convert a row-oriented dataset to a column-oriented dataset and vice-versa at the expense of compute. In particular, advanced query engines often leverage each orientation's advantages, and convert from one orientation to
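The two encodings mentioned here, a bitmap for a Boolean column and run-length encoding for a column with repeated values, can be sketched in a few lines of Python. This is a minimal illustration, not the encoding used by any particular database; the function names (`pack_bits`, `run_length_encode`, `run_length_decode`) and the sample data are assumptions made up for the example.

```python
from itertools import groupby


def pack_bits(flags):
    """Pack a list of booleans into bytes, eight flags per byte (a simple bitmap)."""
    packed = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(runs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


# Hypothetical Boolean column with 128 rows: 128 bits fit in 16 bytes.
flags = [i % 3 == 0 for i in range(128)]
bitmap = pack_bits(flags)

# Hypothetical low-cardinality column: long runs compress well.
country = ["DE"] * 5 + ["US"] * 3 + ["DE"] * 2
runs = run_length_encode(country)
```

Both encodings rely on the values of one column being stored next to each other, which is why they apply naturally to a column-oriented layout and much less so to a row-oriented one.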

240-580: A combined in-memory/on-disk database system. Some device product lines, especially in consumer electronics, include some units with permanent storage, and others that rely on memory for storage (set-top boxes, for example). If such devices require a database system, a manufacturer can adopt a hybrid database system at lower cost, and with less customization of code, rather than using separate in-memory and on-disk databases, respectively, for its disk-less and disk-based products. The first database engine to support both in-memory and on-disk tables in

280-508: A new row. Column-oriented benefits from fast insertion of a new column. This dimension is an important reason why row-oriented formats are more commonly used in Online transaction processing (OLTP), as it results in faster transactions in comparison to column-oriented. Row-oriented benefits from fast access under a filter. Column-oriented benefits from fast access under a projection. Column-oriented benefits from fast analytics operations. This

320-468: A number of analytic engines for various kinds of data processing. The Business Function Library includes a number of algorithms made available to address common business data processing algorithms such as asset depreciation , rolling forecast and moving average . The Predictive Analytics Library includes native algorithms for calculating common statistical measures in areas such as clustering , classification and time series analysis . HANA incorporates

360-404: A row store and a columnar store. Users can create tables using either store, but the columnar store has more capabilities and is most frequently used. The index server also manages persistence between cached memory images of database objects, log files and permanent storage files. The XS engine allows web applications to be built. SAP HANA Information Modeling (also known as SAP HANA Data Modeling)

400-402: A single column in the same location, rather than storing all data for a single row in the same location (row-oriented systems). This can enable performance improvements for OLAP queries on large datasets and allows greater vertical compression of similar types of data in a single column. If the read times for column-stored data are fast enough, consolidated views of the data can be performed on

440-418: A single database, WebDNA , was released in 1995. Another variation involves large amounts of nonvolatile memory in the server, for example, flash memory chips as addressable memory rather than structured as disk arrays. A database in this form of memory combines very fast access speed with persistence over reboots and power losses. Column-oriented DBMS Data orientation refers to how tabular data

480-544: A threshold of accuracy for results. Analyses available include identifying entities such as people, dates, places, organizations, requests, problems, and more. Such entity extraction can be catered to specific use cases such as Voice of the Customer (customer's preferences and expectations), Enterprise (i.e. mergers and acquisitions, products, organizations), and Public Sector (public persons, events, organizations). Custom extraction and dictionaries can also be implemented. Besides

520-468: Is a column-oriented , in-memory database , that combines OLAP and OLTP operations into a single system; thus in general SAP HANA is an "online transaction and analytical processing" (OLTAP) system, also known as a hybrid transactional/analytical processing (HTAP). Storing data in main memory rather than on disk provides faster data access and, by extension, faster querying and processing. While storing data in-memory confers performance advantages, it

560-415: Is a more costly form of data storage. Observed data access patterns suggest that up to 85% of data in an enterprise system may be infrequently accessed; it can therefore be cost-effective to store frequently accessed, or "hot", data in-memory while the less frequently accessed "warm" data is stored on disk, an approach SAP began to support in 2016 and termed "Dynamic tiering". Column-oriented systems store all data for

600-449: Is a part of HANA application development. Modeling is the methodology to expose operational data to the end user. Reusable virtual objects (named calculation views) are used in the modelling process. SAP HANA manages concurrency through the use of multiversion concurrency control (MVCC), which gives every transaction a snapshot of the database at a point in time. When an MVCC database needs to update an item of data, it will not overwrite

640-445: Is an important architectural decision of systems handling data because it results in important tradeoffs in performance and storage . Below are selected dimensions of this tradeoff. Row-oriented benefits from fast random access of rows. Column-oriented benefits from fast random access of columns. In both cases, this is the result of fewer page or cache misses when accessing the data. Row-oriented benefits from fast insertion of

680-443: Is cached, as opposed to the most frequently accessed data being stored in-memory. The flexibility of hybrid approaches allows a balance to be struck between: In the cloud computing industry the terms "data temperature", or "hot data" and "cold data" have emerged to describe how data is stored in this respect. Hot data is used to describe mission-critical data that needs to be accessed frequently while cold data describes data that

720-522: Is certified by the Open Geospatial Consortium , and it integrates with ESRI's ArcGIS geographic information system . In addition to numerical and statistical algorithms, HANA can perform text analytics and enterprise text search. HANA's search capability is based on “fuzzy” fault-tolerant search, much like modern web-based search engines. Results include a statistical measure for how relevant search results are, and search criteria can include

760-406: Is critical, such as those running telecommunications network equipment and mobile advertising networks, often use main-memory databases. IMDBs have gained much traction, especially in the data analytics space, starting in the mid-2000s – mainly due to multi-core processors that can address large memory and due to less expensive RAM . A potential technical hurdle with in-memory data storage

800-520: Is needed less often and less urgently, such as data kept for archiving or auditing purposes. Hot data should be stored in ways offering fast retrieval and modification, often accomplished by in-memory storage but not always. Cold data, on the other hand, can be stored in a more cost-effective way, and it is accepted that data access will likely be slower compared to hot data. While these descriptions are useful, "hot" and "cold" lack concrete definitions. Manufacturing efficiency provides another reason for selecting

840-873: Is represented in a linear memory model such as in-disk or in-memory. The two most common representations are column-oriented (columnar format) and row-oriented (row format). The choice of data orientation is a trade-off and an architectural decision in databases, query engines, and numerical simulations. As a result of these tradeoffs, row-oriented formats are more commonly used in Online transaction processing (OLTP) and column-oriented formats are more commonly used in Online analytical processing (OLAP). Examples of column-oriented formats include Apache ORC, Apache Parquet, Apache Arrow, and the formats used by BigQuery, Amazon Redshift and Snowflake. Predominant examples of row-oriented formats include CSV, formats used in most relational databases, the in-memory format of Apache Spark, and Apache Avro. Tabular data

880-426: Is the result of being able to leverage SIMD instructions. Column-oriented benefits from smaller uncompressed size. This is the result of the possibility that this orientation offers to represent certain data types with dedicated encodings. For example, a table of 128 rows with a Boolean column requires 128 bytes in a row-oriented format (one byte per Boolean) but 128 bits (16 bytes) in a column-oriented format (via

920-447: Is the volatility of RAM. Specifically in the event of a power loss, intentional or otherwise, data stored in volatile RAM is lost. With the introduction of non-volatile random-access memory technology, in-memory databases will be able to run at full speed and maintain data in the event of power failure. In its simplest form, main memory databases store data on volatile memory devices. These devices lose all stored information when

960-476: Is two dimensional in nature - data is represented in rows and columns. However, modern operating systems logically represent data in a linear memory model , both in-disk and in-memory. Therefore, a table in a linear memory model requires projecting its two-dimensional items in a one-dimensional space. Data orientation refers to the decision taken in this projection. There are two prominent choices of orientation: row-oriented and column-oriented. In row-oriented,

1000-525: The Hasso Plattner Institute and Stanford University demonstrated an application architecture for real-time analytics and aggregation using the name HYRISE. Former SAP SE executive, Vishal Sikka , mentioned this architecture as "Hasso's New Architecture". Before the name "HANA" stabilized, people referred to this product as "New Database". The software was previously called "SAP High-Performance Analytic Appliance". A first research paper on HYRISE

1040-521: The HANA platform, combines functionality for ERP , CRM , SRM and others into a single HANA system. S/4HANA is intended to be a simplified business suite, replacing earlier generation ERP systems. While it is likely that SAP will focus its innovations on S/4HANA, some customers using non-HANA systems have raised concerns of being locked into SAP products. Since S/4HANA requires an SAP HANA system to run, customers running SAP business suite applications on hardware not certified by SAP would need to migrate to

1080-433: The application server is a suite of application lifecycle management tools allowing development, deployment, and monitoring of user-facing applications. HANA can be deployed on-premises or in the cloud from a number of cloud service providers. HANA can be deployed on-premises as a new appliance from a certified hardware vendor. Alternatively, existing hardware components such as storage and network can be used as part of

1120-427: The benefits of in-memory storage while limiting its costs is to store the most frequently accessed data in-memory and the rest on disk. Since there is no hard distinction between which data should be stored in-memory and which should be stored on disk, some systems dynamically update where data is stored based on the data's usage. This approach is subtly different from caching , in which the most recently accessed data
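The distinction drawn here between keeping the most frequently accessed data in memory and caching the most recently accessed data can be made concrete with a small sketch. The classes `FrequencyTier` and `LRUCache`, their capacities, and the access trace are hypothetical and chosen only for illustration; they are not how any specific product decides data placement.

```python
from collections import Counter, OrderedDict


class FrequencyTier:
    """Keeps the `capacity` most frequently accessed keys in the hot (in-memory) tier."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = Counter()

    def access(self, key):
        self.counts[key] += 1

    def hot_keys(self):
        return {key for key, _ in self.counts.most_common(self.capacity)}


class LRUCache:
    """Keeps the `capacity` most recently accessed keys, evicting the oldest."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def access(self, key):
        self.entries[key] = True
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used key

    def cached_keys(self):
        return set(self.entries)


# Hypothetical access trace: "a" and "b" are used often, "c" and "d" only once, at the end.
accesses = ["a"] * 5 + ["b"] * 4 + ["c", "d"]
tier = FrequencyTier(capacity=2)
cache = LRUCache(capacity=2)
for key in accesses:
    tier.access(key)
    cache.access(key)
```

With this trace the frequency-based tier keeps `a` and `b` in memory, while the recency-based cache ends up holding `c` and `d`, even though those keys were each touched only once.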

1160-674: The database and data analytics capabilities, SAP HANA is a web-based application server , hosting user-facing applications tightly integrated with the database and analytics engines of HANA. The "XS Advanced Engine" (XSA) natively works with Node.js and JavaEE languages and runtimes. XSA is based on Cloud Foundry architecture and thus supports the notion of “Bring Your Own Language”, allowing developers to develop and deploy applications written in languages and in runtimes other than those XSA implements natively, as well as deploying applications as microservices . XSA also allows server-side JavaScript with SAP HANA XS Javascript (XSJS). Supporting

1200-452: The database – thus, faster-changing data that can easily be regenerated or that has no meaning after a system shut-down would not need to be journaled for durability (though it would have to be replicated for high availability), whereas configuration information would be flagged as needing preservation. While storing data in-memory confers performance advantages, it is an expensive method of data storage. An approach to realising

1240-531: The device loses power or is reset. In this case, IMDBs can be said to lack support for the "durability" portion of the ACID (atomicity, consistency, isolation, durability) properties. Volatile memory-based IMDBs can, and often do, support the other three ACID properties of atomicity, consistency and isolation. Many IMDBs have added durability via the following mechanisms: Some IMDBs allow the database schema to specify different durability requirements for selected areas of

1280-405: The early development of SAP HANA, a number of technologies were developed or acquired by SAP SE . These included TREX search engine ( in-memory column-oriented search engine ), P*TIME (in-memory online transaction processing (OLTP) Platform acquired by SAP in 2005), and MaxDB with its in-memory liveCache engine. The first major demonstration of the platform was in 2011: teams from SAP SE ,

1320-554: The elements of the table are stored linearly row by row, i.e. each row of the table is located one after the other. In this orientation, values on the same row are close in space (e.g. similar address in an addressable space). In column-oriented, the elements of the table are stored linearly column by column, i.e. each column of the table is located one after the other. In this orientation, values on the same column are close in space (e.g. similar address in an addressable space). See list of column-oriented DBMSes for more examples. The data orientation
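As a rough illustration of the two linearizations, the following Python sketch flattens the same small table once row by row and once column by column. The table contents and the helper names `to_row_oriented` and `to_column_oriented` are assumptions introduced for the example.

```python
# Hypothetical 3x3 table used only for illustration: (name, age, country).
table = [
    ("alice", 30, "DE"),
    ("bob", 25, "US"),
    ("carol", 41, "FR"),
]


def to_row_oriented(rows):
    """Flatten row by row: all values of row 0, then all values of row 1, and so on."""
    return [value for row in rows for value in row]


def to_column_oriented(rows):
    """Flatten column by column: all values of column 0, then column 1, and so on."""
    return [value for column in zip(*rows) for value in column]


row_layout = to_row_oriented(table)
column_layout = to_column_oriented(table)
```

In `row_layout` the three fields of one record sit next to each other, which suits reading or inserting whole rows. In `column_layout` all ages are contiguous, which suits scanning or aggregating a single column.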

1360-496: The fly , removing the need for maintaining aggregate views and its associated data redundancy . Although row-oriented systems have traditionally been favored for OLTP , in-memory storage opens techniques to develop hybrid systems suitable for both OLAP and OLTP capabilities, removing the need to maintain separate systems for OLTP and OLAP operations. The index server performs session management, authorization, transaction management and command processing. The database has both

1400-569: The full use license, features are grouped as editions targeting various use cases. In addition, capabilities such as streaming and ETL are licensed as additional options. As of March 9, 2017, SAP HANA is available in an Express edition, a streamlined version which can run on laptops and other resource-limited environments. The license for SAP HANA, express edition is free of charge, even for productive use up to 32 GB of RAM. Additional capacity increases can be purchased up to 128 GB of RAM. In memory Platform Applications where response time

1440-469: The graph engine include pattern matching, neighborhood search, single shortest path, and strongly connected components. Typical usage situations for the Graph Engine include examples like supply chain traceability, fraud detection, and logistics and route planning. HANA also includes a spatial database engine which implements spatial data types and SQL extensions for CRUD operations on spatial data. HANA

1480-455: The implementation, an approach which SAP calls "Tailored Data Center Integration (TDI)". HANA is certified to run on multiple operating systems including SUSE Linux Enterprise Server and Red Hat Enterprise Linux . Supported hardware platforms for on-premise deployment include Intel 64 and POWER Systems . The system is designed to support both horizontal and vertical scaling . Multiple cloud providers offer SAP HANA on an Infrastructure as

1520-401: The old data with new data, but will instead mark the old data as obsolete and add the newer version. In a scale-out environment, HANA can keep volumes of up to a petabyte of data in memory while returning query results in under a second. However, RAM is still much more expensive than disk space, so the scale-out approach is only feasible for certain time-critical use cases. SAP HANA includes
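The multiversion idea described here, where an update adds a new version instead of overwriting the old one and each reader sees a snapshot from a point in time, can be sketched as a toy key-value store. This is a deliberately simplified illustration and not a description of HANA's internals; the class `VersionedStore`, its logical clock, and its method names are assumptions.

```python
class VersionedStore:
    """Toy multiversion store: writes append versions, reads use a snapshot timestamp."""

    def __init__(self):
        self._versions = {}  # key -> append-only list of (commit_ts, value)
        self._clock = 0      # logical commit counter, assumed for this sketch

    def write(self, key, value):
        """Add a new version; older versions stay in place for existing snapshots."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """Return the timestamp that defines what a new reader may see."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts, or None."""
        visible = None
        for commit_ts, value in self._versions.get(key, []):
            if commit_ts <= snapshot_ts:
                visible = value
            else:
                break
        return visible


store = VersionedStore()
store.write("price", 100)
reader_snapshot = store.snapshot()  # a reader starts here
store.write("price", 120)           # a later update does not overwrite version 1
```

A reader holding `reader_snapshot` continues to see the value 100, while a reader that takes a fresh snapshot sees 120. A real system would also garbage-collect versions that no snapshot can reach any more, which this sketch omits.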

1560-472: The open source statistical programming language R as a supported language within stored procedures. The column-store database offers graph database capabilities. The graph engine processes the Cypher Query Language and also offers visual graph manipulation through a tool called Graph Viewer. Graph data structures are stored directly in relational tables in HANA's column store. Pre-built algorithms in

1600-635: Was published in November 2010. The research engine was later released as open source in 2013 and was reengineered in 2016, becoming HYRISE2 in 2017. The first product shipped in late November 2010. By mid-2011, the technology had attracted interest but more experienced business customers considered it to be "in early days". HANA support for SAP NetWeaver Business Warehouse (BW) was announced in September 2011 for availability by November. In 2012, SAP promoted aspects of cloud computing. In October 2012, SAP announced
