Companies using SAP HANA have new opportunities to deploy powerful and certified scale-out systems for their mission-critical virtualized infrastructures in their data centers. The SAP Standard Application Benchmarks were developed to help partners and customers identify the optimal hardware configuration for their specific requirements – based on strict SAP standards for data throughput, speed, and scalability.
Advantages not only for read accesses – RAM is still volatile, so write operations to main memory must be protected by a persistence layer, i.e. ultimately by storage again. For read accesses, even to very large amounts of data, RAM is already well equipped: computers with high RAM capacities (up to several TB) are available at reasonable prices, thanks to ever higher packing density of the memory elements and simultaneously falling prices. Since reading from RAM is now efficient and affordable, SAP HANA and other in-memory technologies focus on read-heavy applications such as reporting and business intelligence (OnLine Analytical Processing, OLAP). However, advantages can be gained for transactional systems (OnLine Transaction Processing, OLTP) as well.
On the one hand, online reporting on transactional data becomes possible without performance losses in transaction processing; on the other hand, code sections with a high communication volume between database and application can already benefit from being moved into the database. But whether OLAP or OLTP, the In-Memory DB (IMDB) requires persistence, because the data in RAM is lost as soon as the computer is switched off.
Single node or scale-out? – If the code has to fetch its data from a neighboring node, communication effort between the nodes arises again. This comes with comparatively high latency, almost as if the code had simply remained on the application server. For this reason, a single-node implementation of HANA for OLTP is definitely preferable to a scale-out architecture.
Initially, SAP insisted on HANA as a single node with the requirement for fast (internal) log devices. However, internal log devices are not acceptable for business-critical OLTP applications, since a loss of the computer or the log device also means a loss of data. Business-critical data, especially the log data, should therefore always also be written (mirrored) to a second location, so that in an emergency the database can be recovered from a second source up to the last completed transaction.
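The mirroring requirement can be illustrated with a minimal sketch in Python. This is not HANA's actual mechanism; the paths simply stand in for two independent storage devices or sites, and the function name is hypothetical:

```python
import os

def mirrored_log_write(entry, primary_path, mirror_path):
    """Sketch of synchronous log mirroring (hypothetical helper, not
    HANA's implementation). The transaction is acknowledged only after
    BOTH copies are durable on their respective devices, so losing one
    device never loses committed log data."""
    for path in (primary_path, mirror_path):
        with open(path, "a") as f:
            f.write(entry + "\n")
            f.flush()
            os.fsync(f.fileno())   # force the entry to stable storage
    return True  # only now may the transaction be reported as complete
```

The key design point is that both writes are synchronous: the caller is blocked until the entry is durable in two places, which is exactly what allows recovery up to the last completed transaction after losing either copy.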
Persistence Layer and Performance – Since data accesses in an IMDB happen mainly in RAM, one might expect storage to play a minor role as a persistence layer with regard to performance, serving primarily as protection against data loss. However, SAP's requirements for persistence performance were, and in some cases still are, higher than those for classic databases.
In general, two write mechanisms can be identified for databases: the logwriter and the datawriter. The logwriter documents each individual change (insert, update, delete) made to the database in a separate area in real time (synchronously). The datawriter updates the changed tables in storage from time to time (asynchronously) and thereby maintains a consistent, but mostly not up-to-date (because asynchronous), image of the database. The logwriter is critical for transaction processing and, if necessary, for database recovery: a transaction is not considered completed until the logwriter has reported it as documented, and only then can processing continue. This ensures that after an unplanned termination of the database, the last valid state can be restored by updating the last consistent data image with the log entries not yet recorded there (roll forward).
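The interplay of the two write paths can be sketched in a few lines of Python. This is a toy key-value store, not HANA's persistence layer; the class and method names are invented for illustration:

```python
import json
import os

class MiniDB:
    """Toy key-value store illustrating the two database write paths:
    a synchronous logwriter and an asynchronous datawriter."""

    def __init__(self, log_path, data_path):
        self.log_path = log_path
        self.data_path = data_path
        self.table = {}                    # in-memory state (the "RAM" copy)
        self.log = open(log_path, "a")

    def write(self, key, value):
        # Logwriter: document the change synchronously. The transaction
        # only counts as completed once fsync confirms the log entry.
        self.log.write(json.dumps({"k": key, "v": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.table[key] = value            # change applied in RAM

    def checkpoint(self):
        # Datawriter: persist a consistent (but possibly stale) image
        # of the whole table; a real DB does this asynchronously.
        with open(self.data_path, "w") as f:
            json.dump(self.table, f)

    def recover(self):
        # Roll forward: start from the last consistent data image, then
        # replay log entries to reach the last completed transaction.
        if os.path.exists(self.data_path):
            with open(self.data_path) as f:
                self.table = json.load(f)
        with open(self.log_path) as f:
            for line in f:
                entry = json.loads(line)
                self.table[entry["k"]] = entry["v"]
```

After a crash, `recover()` reconstructs exactly the state of the last completed transaction: changes made after the last checkpoint are missing from the data image but are replayed from the log.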
Scale-out and hardware support –It remains to be discussed how a scale-out architecture can be evaluated in relation to a single node. Basically, for both OLTP and OLAP, the single node is the preferred alternative for the same database size, provided that the RAM capacities allow it.
There are two main reasons for this. The first has already been touched on in connection with OLTP: communication between the database nodes is comparatively time-consuming and has a negative effect on performance. In OLAP applications in particular, the problem of skilfully assigning code sections to the data is not as relevant as with OLTP, since queries can usually be distributed well thanks to their mathematical structure. Nevertheless, the latency problem remains, because the partial results of a query have to be merged on one node and consolidated into a final result. A second problem arises, for example, with joins across tables that are distributed over several nodes. Before the join can be executed, the data of the tables involved must be transferred to the node on which the join runs and stored there temporarily. This costs both time and additional main memory. With a single node, data transfer and intermediate storage are not necessary, since all data is local. It is therefore recommended to serve applications with a single-node instance for as long as possible.
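The data movement that a cross-node join forces can be made concrete with a small sketch. This assumes a hypothetical two-node partitioning and a simple hash join on a coordinator node; it is an illustration of the cost argument, not of HANA's actual distributed engine:

```python
# Each "node" holds a partition of both tables (hypothetical layout).
node1 = {"orders":    [(1, "A"), (2, "B")],
         "customers": [("A", "Alice")]}
node2 = {"orders":    [(3, "A")],
         "customers": [("B", "Bob")]}

def distributed_join(nodes, left, right):
    """Before the join can run, every partition of both tables must be
    shipped to the coordinating node and held there temporarily, which
    costs transfer time and extra main memory."""
    left_rows, right_rows, transferred = [], [], 0
    for node in nodes:                      # data-transfer phase
        left_rows += node[left]
        right_rows += node[right]
        transferred += len(node[left]) + len(node[right])
    # Local hash join on the coordinator node.
    lookup = {key: name for key, name in right_rows}
    joined = [(oid, lookup[cust]) for oid, cust in left_rows if cust in lookup]
    return joined, transferred

result, moved = distributed_join([node1, node2], "orders", "customers")
print(moved)  # number of rows copied before any join work could start
```

On a single node, `transferred` would be zero: the same hash join would run directly on local data, with no intermediate copy consuming extra RAM.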
Support by the current hardware – Current developments in hardware technology accommodate this approach. With the hardware officially available in February 2014, it is possible to install up to 12 terabytes of RAM in a Fujitsu machine. SAP has announced that, with the new hardware, it will support up to 6 terabytes on one computer for productive OLTP systems, and up to 2 terabytes for OLAP with eight equipped sockets, compared to 1 terabyte in the past. This sounds plausible, since the CPU performance of the new processor generation has roughly doubled. However, the performance of the HANA technology itself has also improved continuously and significantly in recent years, so that from a technical point of view one can imagine an even greater RAM expansion than 2 terabytes per node in a scale-out architecture in the future.
Where is the limit? – There seems to be no real limit. One customer, for example, moved their four SAP BW systems, 300 TB in total, with the biggest one holding over 160 TB of data, to SAP HANA without any problems.
Is it worth it? – That depends heavily on your use case. If you have time-critical processes, the cost might be recouped within a short time. The major cost driver of a SAP HANA deployment is the high price of RAM compared to "normal" disk space.
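A back-of-the-envelope calculation shows why RAM dominates the cost. The prices below are purely hypothetical placeholders, not quotes from the source; substitute current vendor figures before drawing any conclusions:

```python
# Rough RAM-vs-disk cost comparison for holding a database in memory.
# Both per-GB prices are hypothetical illustration values.
db_size_tb = 2
price_ram_per_gb = 10.0    # hypothetical $/GB for server RAM
price_disk_per_gb = 0.05   # hypothetical $/GB for disk storage

ram_cost = db_size_tb * 1024 * price_ram_per_gb
disk_cost = db_size_tb * 1024 * price_disk_per_gb
print(f"RAM: ${ram_cost:,.0f}  disk: ${disk_cost:,.0f}  "
      f"ratio: {ram_cost / disk_cost:.0f}x")
```

Even with conservative assumptions, the per-gigabyte gap spans two or more orders of magnitude, which is why time-critical workloads are the ones that justify the investment.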