Concurrency versus consistency in NoSQL databases

Sonal Kanungo, Rustom D. Morena

Abstract


With the advent of cloud services, the volume of data has grown to unprecedented levels. Distributing load across multiple servers, driven by web and mobile applications, has become a defining characteristic of contemporary data management. Traditional relational databases, with their inherent focus on structured data models, have proven inadequate for handling vast amounts of unstructured data. They also lack built-in support for clustering, which is vital for efficient unstructured data management, and are therefore ill-equipped for customized clustering techniques and optimal query execution. SQL (Structured Query Language) databases were themselves a groundbreaking solution when they emerged, introducing the relational model, which organizes data into structured tables. They rely on the ACID (atomicity, consistency, isolation, durability) properties to maintain data integrity and support intricate querying through SQL. However, as applications grew in complexity, SQL databases struggled to handle varied data types, rapid data growth, and concurrent workloads. These limitations propelled the rise of NoSQL (Not Only Structured Query Language) databases, which prioritize adaptability, scalability, and performance. NoSQL databases embrace diverse data models such as documents, key-value pairs, column families, and graphs, enabling effective management of structured, semi-structured, and unstructured data.
The transition to NoSQL databases was justified by several factors: horizontal scaling across nodes, which handles extensive read-write workloads effectively; support for agile development, accommodating changing data structures without schema constraints; optimization for specific tasks, providing low-latency access and high throughput; dynamic schemas that align with modern iterative development and promote adaptability; and the ability to manage diverse data types, spanning text, geospatial, time-series, and multimedia data. These databases are purposefully designed to accommodate the escalating demands of data storage. Notably, the data originates from diverse nodes and is subject to concurrent access by numerous users. A critical challenge arises because the data present on one node may diverge from its counterpart on a replica node. In this context, executing database operations simultaneously while preserving the integrity of the data becomes a pivotal concern. Maintaining data consistency under concurrent access hinges on synchronizing operations across all replica nodes, and achieving this synchronization requires a robust concurrency control technique. Concurrency control is the linchpin for upholding accuracy and reliability in a system where operations unfold concurrently. Hence, the focal point of this investigation is the assorted concurrency control methodologies employed by NoSQL systems. The objective is to dissect the interplay between concurrency and consistency, shedding light on the strategies these systems employ to balance the two.
In summation, as data management enters an era of exponential growth catalyzed by cloud services, the dynamics of load distribution and unstructured data have necessitated a departure from traditional relational databases. NoSQL databases have risen to the fore, demonstrating the ability to grapple with these challenges. However, the need for concurrent data access without compromising data consistency motivates the exploration of various concurrency control methods. The aim of this study is to examine some of the different concurrency control approaches employed by NoSQL systems, highlighting how they prioritize concurrency and consistency.
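
As an illustration of the tension between concurrent access and consistency described above, the following is a minimal sketch of optimistic concurrency control using a version check. It is not taken from the paper and does not reproduce the API of any particular NoSQL system: `ToyStore`, `Versioned`, `VersionConflict`, and `write_if_version` are hypothetical names, and the store is a single in-memory dictionary rather than a replicated cluster.

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    value: str
    version: int


class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


class ToyStore:
    """Hypothetical single-node, in-memory key-value store (not a real NoSQL API)."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        # Return a copy so callers cannot mutate stored state directly.
        item = self._data.get(key, Versioned("", 0))
        return Versioned(item.value, item.version)

    def write_if_version(self, key, value, expected_version):
        # Optimistic check: commit only if nobody has written since our read.
        current = self._data.get(key, Versioned("", 0))
        if current.version != expected_version:
            raise VersionConflict(
                f"{key}: expected v{expected_version}, found v{current.version}"
            )
        self._data[key] = Versioned(value, current.version + 1)


store = ToyStore()
store.write_if_version("cart", "empty", expected_version=0)

# Two clients read the same version, then both try to write.
a = store.read("cart")
b = store.read("cart")
store.write_if_version("cart", "book", a.version)     # first writer commits
try:
    store.write_if_version("cart", "pen", b.version)  # second writer holds a stale version
    conflict = False
except VersionConflict:
    conflict = True

final = store.read("cart")
```

In this sketch neither client blocks the other, which favours concurrency; consistency is preserved by rejecting the stale write, and the losing client would have to re-read and retry. A locking approach would make the opposite trade, serializing the two writers up front.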


Keywords


multiversion; NoSQL; locking; optimistic; MongoDB; Cassandra; DynamoDB


DOI: https://doi.org/10.32629/jai.v7i3.936


Copyright (c) 2023 Sonal Kanungo, Rustom D. Morena

License URL: https://creativecommons.org/licenses/by-nc/4.0/