Misplaced Pages

Cosmos DB

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

Azure Cosmos DB is a globally distributed, multi-model database service offered by Microsoft . It is designed to provide high availability, scalability, and low-latency access to data for modern applications. Unlike traditional relational databases, Cosmos DB is a NoSQL (meaning "Not only SQL", rather than "zero SQL") and vector database, which means it can handle unstructured, semi-structured, structured, and vector data types.

#827172

64-560: Internally, Cosmos DB stores "items" in "containers", with these two concepts being surfaced differently depending on the API used (these would be "documents" in "collections" when using the MongoDB-compatible API, for example). Containers are grouped in "databases", which are analogous to namespaces above containers. Containers are schema-agnostic, which means that no schema is enforced when adding items. By default, every field in each item

128-402: A REST API, which itself is implemented in various SDKs that are officially supported by Microsoft and available for .NET Framework , .NET , Node.js ( JavaScript ), Java and Python . Cosmos DB added automatic partitioning capability in 2016 with the introduction of partitioned containers. Behind the scenes, partitioned containers span multiple physical partitions with items distributed by

192-544: A procedural language such as Lua could consist primarily of basic routines to execute code, manipulate data or handle errors while an API for an object-oriented language , such as Java, would provide a specification of classes and its class methods . Hyrum's law states that "With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody." Meanwhile, several studies show that most applications that use an API tend to use

256-420: A software framework : a framework can be based on several libraries implementing several APIs, but unlike the normal use of an API, the access to the behavior built into the framework is mediated by extending its content with new classes plugged into the framework itself. Moreover, the overall program flow of control can be out of the control of the caller and in the framework's hands by inversion of control or

320-449: A broad term describing much of the communication on the internet. When used in this way, the term API has overlap in meaning with the term communication protocol . The interface to a software library is one type of API. The API describes and prescribes the "expected behavior" (a specification) while the library is an "actual implementation" of this set of rules. A single API can have multiple implementations (or none, being abstract) in

384-517: A business ecosystem. The main policies for releasing an API are: An important factor when an API becomes public is its "interface stability". Changes to the API—for example adding new parameters to a function call—could break compatibility with the clients that depend on that API. When parts of a publicly presented API are subject to change and thus not stable, such parts of a particular API should be documented explicitly as "unstable". For example, in

448-457: A client would need to know for practical purposes. Documentation is crucial for the development and maintenance of applications using the API. API documentation is traditionally found in documentation files but can also be found in social media such as blogs, forums, and Q&A websites. Traditional documentation files are often presented via a documentation system, such as Javadoc or Pydoc, that has

512-439: A client-supplied partition key. Cosmos DB automatically decides how many partitions to spread data across depending on the size and throughput needs. When partitions are added or removed, the operation is performed without any downtime so data remains available while it is re-balanced across the new or remaining partitions. Before partitioned containers were available, it was common to write custom code to partition data and some of

576-425: A computer to a person, an application programming interface connects computers or pieces of software to each other. It is not intended to be used directly by a person (the end user ) other than a computer programmer who is incorporating it into software. An API is often made up of different parts which act as tools or services that are available to the programmer. A program or a programmer that uses one of these parts

640-435: A consistent appearance and structure. However, the types of content included in the documentation differs from API to API. In the interest of clarity, API documentation may include a description of classes and methods in the API as well as "typical usage scenarios, code snippets, design rationales, performance discussions, and contracts", but implementation details of the API services themselves are usually omitted. It can take

704-524: A given API, it is possible to infer the typical usages, as well the required contracts and directives. Then, templates can be used to generate natural language from the mined data. In 2010, Oracle Corporation sued Google for having distributed a new implementation of Java embedded in the Android operating system. Google had not acquired any permission to reproduce the Java API, although permission had been given to

SECTION 10

#1732852833828

768-441: A modular software library in the 1940s for EDSAC , an early computer. The subroutines in this library were stored on punched paper tape organized in a filing cabinet . This cabinet also contained what Wilkes and Wheeler called a "library catalog" of notes about each subroutine and how to incorporate it into a program. Today, such a catalog would be called an API (or an API specification or API documentation) because it instructs

832-417: A new software product. The process of joining is called integration . As an example, consider a weather sensor that offers an API. When a certain message is transmitted to the sensor, it will detect the current weather conditions and reply with a weather report. The message that activates the sensor is an API call , and the weather report is an API response . A weather forecasting app might integrate with

896-470: A number of forms, including instructional documents, tutorials, and reference works. It'll also include a variety of information types, including guides and functionalities. Restrictions and limitations on how the API can be used are also covered by the documentation. For instance, documentation for an API function could note that its parameters cannot be null, that the function itself is not thread safe . Because API documentation tends to be comprehensive, it

960-414: A number of weather sensor APIs, gathering weather data from throughout a geographical area. An API is often compared to a contract . It represents an agreement between parties: a service provider who offers the API and the software developers who rely upon it. If the API remains stable, or if it changes only in predictable ways, developers' confidence in the API will increase. This may increase their use of

1024-440: A programmer on how to use (or "call") each subroutine that the programmer needs. Wilkes and Wheeler's book The Preparation of Programs for an Electronic Digital Computer contains the first published API specification. Joshua Bloch considers that Wilkes and Wheeler "latently invented" the API, because it is more of a concept that is discovered than invented. The term "application program interface" (without an -ing suffix)

1088-643: A shipping company API that can be added to an eCommerce-focused website to facilitate ordering shipping services and automatically include current shipping rates, without the site developer having to enter the shipper's rate table into a web database. While "web API" historically has been virtually synonymous with web service , the recent trend (so-called Web 2.0 ) has been moving away from Simple Object Access Protocol ( SOAP ) based web services and service-oriented architecture (SOA) towards more direct representational state transfer (REST) style web resources and resource-oriented architecture (ROA). Part of this trend

1152-404: A similar mechanism. An API can specify the interface between an application and the operating system . POSIX , for example, specifies a set of common APIs that aim to enable an application written for a POSIX conformant operating system to be compiled for another POSIX conformant operating system. Linux and Berkeley Software Distribution are examples of operating systems that implement

1216-441: A small part of the API. Language bindings are also APIs. By mapping the features and capabilities of one language to an interface implemented in another language, a language binding allows a library or service written in one language to be used when developing in another language. Tools such as SWIG and F2PY, a Fortran -to- Python interface generator, facilitate the creation of such interfaces. An API can also be related to

1280-463: A software system, used for machine-to-machine communication. A well-designed API exposes only objects or actions needed by software or software developers. It hides details that have no use. This abstraction simplifies programming. Building software using APIs has been compared to using building-block toys, such as Lego bricks. Software services or software libraries are analogous to the bricks; they may be joined together via their APIs, composing

1344-538: A system of commands and thereby bar all others from writing its different versions to carry out all or part of the same commands. Consistency (database systems) In database systems , consistency (or correctness ) refers to the requirement that any given database transaction must change affected data only in allowed ways. Any data written to the database must be valid according to all defined rules, including constraints , cascades , triggers , and any combination thereof. This does not guarantee correctness of

SECTION 20

#1732852833828

1408-454: Is a challenge for writers to keep the documentation updated and for users to read it carefully, potentially yielding bugs. API documentation can be enriched with metadata information like Java annotations . This metadata can be used by the compiler, tools, and by the run-time environment to implement custom behaviors or custom handling. It is possible to generate API documentation in a data-driven manner. By observing many programs that use

1472-416: Is a type of software interface , offering a service to other pieces of software . A document or standard that describes how to build such a connection or interface is called an API specification . A computer system that meets this standard is said to implement or expose an API. The term API may refer either to the specification or to the implementation. In contrast to a user interface , which connects

1536-512: Is an architectural approach that revolves around providing a program interface to a set of services to different applications serving different types of consumers. When used in the context of web development , an API is typically defined as a set of specifications, such as Hypertext Transfer Protocol (HTTP) request messages, along with a definition of the structure of response messages, usually in an Extensible Markup Language ( XML ) or JavaScript Object Notation ( JSON ) format. An example might be

1600-502: Is automatically indexed, generally providing good performance without tuning to specific query patterns. These defaults can be modified by setting an indexing policy which can specify, for each field, the index type and precision desired. Cosmos DB offers two types of indexes: Containers can also enforce unique key constraints to ensure data integrity. Each Cosmos DB container exposes a change feed, which clients can subscribe to in order to get notified of new items being added or updated in

1664-416: Is based on three trade-offs, one of which is "atomic consistency" (shortened to "consistency" for the acronym), about which the authors note, "Discussing atomic consistency is somewhat different than talking about an ACID database, as database consistency refers to transactions, while atomic consistency refers only to a property of a single request/response operation sequence. And it has a different meaning than

1728-516: Is created in one place dynamically can be posted and updated to multiple locations on the web. For example, Twitter's REST API allows developers to access core Twitter data and the Search API provides methods for developers to interact with Twitter Search and trends data. The design of an API has significant impact on its usage. The principle of information hiding describes the role of programming interfaces as enabling modular programming by hiding

1792-407: Is exclusively reserved for that container. The default maximum RUs that can be provisioned per database and per container are 1,000,000 RUs, but customers can get this limit increased by contacting customer support. As an example of costing, using a single region instance, a count of 1,000,000 records of 1k each in 5s requires 1,000,000 RUs At $ 0.008/ h , which would equal $ 800 . Two regions double

1856-399: Is first recorded in a paper called Data structures and techniques for remote computer graphics presented at an AFIPS conference in 1968. The authors of this paper use the term to describe the interaction of an application—a graphics program in this case—with the rest of the computer system. A consistent application interface (consisting of Fortran subroutine calls) was intended to free

1920-453: Is now the most common meaning of the term API. The Semantic Web proposed by Tim Berners-Lee in 2001 included "semantic APIs" that recast the API as an open , distributed data interface rather than a software behavior interface. Proprietary interfaces and agents became more widespread than open ones, but the idea of the API as a data interface took hold. Because web APIs are widely used to exchange data of all kinds online, API has become

1984-469: Is related to the Semantic Web movement toward Resource Description Framework (RDF), a concept to promote web-based ontology engineering technologies. Web APIs allow the combination of multiple APIs into new applications known as mashups . In the social media space, web APIs have allowed web communities to facilitate sharing content and data between communities and applications. In this way, content that

Cosmos DB - Misplaced Pages Continue

2048-411: Is said to call that portion of the API. The calls that make up the API are also known as subroutines , methods, requests, or endpoints . An API specification defines these calls, meaning that it explains how to use or implement them. One purpose of APIs is to hide the internal details of how a system works, exposing only those parts a programmer will find useful and keeping them consistent even if

2112-539: The Google Guava library, the parts that are considered unstable, and that might change soon, are marked with the Java annotation @Beta . A public API can sometimes declare parts of itself as deprecated or rescinded. This usually means that part of the API should be considered a candidate for being removed, or modified in a backward incompatible way. Therefore, these changes allow developers to transition away from parts of

2176-682: The Java language in particular. In the 1990s, with the spread of the internet , standards like CORBA , COM , and DCOM competed to become the most common way to expose API services. Roy Fielding 's dissertation Architectural Styles and the Design of Network-based Software Architectures at UC Irvine in 2000 outlined Representational state transfer (REST) and described the idea of a "network-based Application Programming Interface" that Fielding contrasted with traditional "library-based" APIs. XML and JSON web APIs saw widespread commercial adoption beginning in 2000 and continuing as of 2021. The web API

2240-501: The Java remote method invocation API uses the Java Remote Method Protocol to allow invocation of functions that operate remotely, but appear local to the developer. Therefore, remote APIs are useful in maintaining the object abstraction in object-oriented programming ; a method call , executed locally on a proxy object, invokes the corresponding method on the remote object, using the remoting protocol, and acquires

2304-535: The Linux Standard Base provides an ABI. Remote APIs allow developers to manipulate remote resources through protocols , specific standards for communication that allow different technologies to work together, regardless of language or platform. For example, the Java Database Connectivity API allows developers to query many different types of databases with the same set of functions, while

2368-697: The TLA+ specification language , with the TLA+ model being open-sourced on GitHub. Cosmos DB's original distribution model involves one single write region, with all other regions being read-only replicas. In March 2018, Microsoft announced a new multi-master capability for Azure Cosmos DB, allowing multiple regions to serve as write replicas. This feature introduced a significant improvement to its original single write-region model, where other regions were read-only. With multi-master, concurrent writes from different regions can lead to potential conflicts, which can be resolved either using

2432-417: The 1940s, though the term did not emerge until the 1960s and 70s. An API opens a software system to interactions from the outside. It allows two software systems to communicate across a boundary — an interface — using mutually agreed-upon signals. In other words, an API connects software entities together. Unlike a user interface , an API is typically not visible to users. It is an "under the hood" portion of

2496-456: The API that will be removed or not supported in the future. Client code may contain innovative or opportunistic usages that were not intended by the API designers. In other words, for a library with a significant user base, when an element becomes part of the public API, it may be used in diverse ways. On February 19, 2020, Akamai published their annual “State of the Internet” report, showcasing

2560-435: The API. The term API initially described an interface only for end-user-facing programs, known as application programs . This origin is still reflected in the name "application programming interface." Today, the term is broader, including also utility software and even hardware interfaces . The idea of the API is much older than the term itself. British computer scientists Maurice Wilkes and David Wheeler worked on

2624-507: The Azure Cosmos DB, without any impact to its transactional workloads. This feature addresses the complexity and latency challenges that occur with the traditional ETL pipelines required to have a data repository optimized to execute Online analytical processing by automatically syncing the operational data into a separate column store suitable for large scale analytical queries to be performed in an optimized manner, resulting in improving

Cosmos DB - Misplaced Pages Continue

2688-519: The Cosmos DB SDKs explicitly supported several different partitioning schemes. That mode is still available but only recommended when storage and throughput requirements do not exceed the capacity of one container, or when the built-in partitioning capability does not otherwise meet the application's needs. Developers can specify desired throughput to match the application's expected load. Cosmos DB reserves resources (memory, CPU and IOPS ) to guarantee

2752-496: The POSIX APIs. Microsoft has shown a strong commitment to a backward-compatible API, particularly within its Windows API (Win32) library, so older applications may run on newer versions of Windows using an executable-specific setting called "Compatibility Mode". An API differs from an application binary interface (ABI) in that an API is source code based while an ABI is binary based. For instance, POSIX provides APIs while

2816-401: The application programming interface separately from other interfaces, such as the query interface. Database professionals in the 1970s observed these different interfaces could be combined; a sufficiently rich application interface could support the other interfaces as well. This observation led to APIs that supported all types of programming, not just application programming. By 1990, the API

2880-399: The complexity of the operations needed. The minimum billing is per hour. Throughput can be provisioned at either the container or the database level. When provisioned at the database level, the throughput is shared across all the containers within that database, with the additional ability to have dedicated throughput for some containers. The throughput provisioned on an Azure Cosmos container

2944-430: The container. As of 7 June 2021 , item deletions are currently not exposed by the change feed. Changes are persisted by Cosmos DB, which makes it possible to request changes from any point in time since the creation of the container. A " Time to Live " (or TTL) can be specified at the container level to let Cosmos DB automatically delete items after a certain amount of time expressed in seconds. This countdown starts after

3008-633: The cost. Cosmos DB databases can be configured to be available in any of the Microsoft Azure regions (54 regions as of December 2018), letting application developers place their data closer to where their users are. Each container's data gets transparently replicated across all configured regions. Adding or removing regions is performed without any downtime or impact on performance. By leveraging Cosmos DB's multi-homing API, applications don't have to be updated or redeployed when regions are added or removed, as Cosmos DB will automatically route their requests to

3072-463: The default "Last Write Wins" (LWW) policy or a custom conflict resolution mechanism, such as a JavaScript function. The LWW policy relies on timestamps to determine the winning write, while the custom option enables developers to handle conflicts through application-defined rule. This feature, announced in May 2020, is a fully isolated column store for enabling large scale analytics against operational data in

3136-420: The form of different libraries that share the same programming interface. The separation of the API from its implementation can allow programs written in one language to use a library written in another. For example, because Scala and Java compile to compatible bytecode , Scala developers can take advantage of any Java API. API use can vary depending on the type of programming language involved. An API for

3200-476: The growing trend of cybercriminals targeting public API platforms at financial services worldwide. From December 2017 through November 2019, Akamai witnessed 85.42 billion credential violation attacks. About 20%, or 16.55 billion, were against hostnames defined as API endpoints. Of these, 473.5 million have targeted financial services sector organizations. API documentation describes what services an API offers and how to use those services, aiming to cover everything

3264-490: The implementation details of the modules so that users of modules need not understand the complexities inside the modules. Thus, the design of an API attempts to provide only the tools a user would expect. The design of programming interfaces represents an important part of software architecture , the organization of a complex piece of software. APIs are one of the more common ways technology companies integrate. Those that provide and use APIs are considered as being members of

SECTION 50

#1732852833828

3328-405: The interactions between an application and its Cosmos DB database. The profiler alerts users to wasted performance and excessive cloud expenditures. It also recommends how to resolve them by isolating and analyzing the code and directing its users to the exact location. API An application programming interface ( API ) is a connection between computers or between computer programs . It

3392-452: The internal details later change. An API may be custom-built for a particular pair of systems, or it may be a shared standard allowing interoperability among many systems. The term API is often used to refer to web APIs , which allow communication between computers that are joined by the internet . There are also APIs for programming languages , software libraries , computer operating systems , and computer hardware . APIs originated in

3456-505: The last update of the item. If needed, the TTL can also be overloaded at the item level. The internal data model described in the previous section is exposed through: The SQL API lets clients create, update and delete containers and items. Items can be queried with a read-only, JSON -friendly SQL dialect. As Cosmos DB embeds a JavaScript engine , the SQL API also enables: The SQL API is exposed as

3520-571: The latency of such queries. Using Microsoft Azure Synapse Link for Cosmos DB, it is possible to build no-ETL Hybrid transactional/analytical processing solutions by directly linking to Azure Cosmos DB analytical store from Synapse Analytics. It enables to run near real-time large-scale analytics directly on the operational data. Gartner Research positions Microsoft as the leader in the Magic Quadrant Operational Database Management Systems in 2016 and calls out

3584-513: The latest value of the Record. Consistency is one of the four guarantees that define ACID transactions ; however, significant ambiguity exists about the nature of this guarantee. It is defined variously as: As these various definitions are not mutually exclusive, it is possible to design a system that guarantees "consistency" in every sense of the word, as most relational database management systems in common use today arguably do. The CAP theorem

3648-595: The programmer from dealing with idiosyncrasies of the graphics display device, and to provide hardware independence if the computer or the display were replaced. The term was introduced to the field of databases by C. J. Date in a 1974 paper called The Relational and Network Approaches: Comparison of the Application Programming Interface . An API became a part of the ANSI/SPARC framework for database management systems . This framework treated

3712-484: The regions that are available and closest to their location. Data consistency is configurable on Cosmos DB, letting application developers choose among five different levels: The desired consistency level is defined at the account level but can be overridden on a per request basis by using a specific HTTP header or the corresponding feature exposed by the SDKs. All five consistency levels have been specified and verified using

3776-553: The requested throughput while maintaining request latency below 10ms for both reads and writes at the 99th percentile . Throughput is specified in Request Units (RUs) per second. The cost to read a 1 KB item is 1 Request Unit (or 1 RU). Select by 'id' operations consume lower number of RUs compared to Delete, Update, and Insert operations for the same document. Large queries (e.g. aggregations like count) and stored procedure executions can consume hundreds to thousands of RUs depending on

3840-442: The result to be used locally as a return value. A modification of the proxy object will also result in a corresponding modification of the remote object. Web APIs are the defined interfaces through which interactions happen between an enterprise and applications that use its assets, which also is a Service Level Agreement (SLA) to specify the functional provider and expose the service path or URL for its API users. An API approach

3904-448: The similar OpenJDK project. Judge William Alsup ruled in the Oracle v. Google case that APIs cannot be copyrighted in the U.S. and that a victory for Oracle would have widely expanded copyright protection to a "functional set of symbols" and allowed the copyrighting of simple software commands: To accept Oracle's claim would be to allow anyone to copyright one version of code to carry out

SECTION 60

#1732852833828

3968-421: The transaction in all ways the application programmer might have wanted (that is the responsibility of application-level code) but merely that any programming errors cannot result in the violation of any defined database constraints. In a distributed system, referencing CAP theorem , consistency can also be understood as after a successful write, update or delete of a Record, any read request immediately receives

4032-441: The unique capabilities of Cosmos DB in their write-up. Microsoft utilizes Cosmos DB in many of its own apps, including Microsoft Office , Skype , Active Directory , Xbox , and MSN . In building a more globally-resilient application / system, Cosmos DB combines with other Azure services, such as Azure App Services and Azure Traffic Manager. The Cosmos DB Profiler cloud cost optimization tool detects inefficient data queries in

4096-455: Was defined simply as "a set of services available to a programmer for performing certain tasks" by technologist Carl Malamud . The idea of the API was expanded again with the dawn of remote procedure calls and web APIs . As computer networks became common in the 1970s and 80s, programmers wanted to call libraries located not only on their local computers, but on computers located elsewhere. These remote procedure calls were well supported by

#827172