
  •   Eman A. Khashan

  •   Ali I. El Desouky

  •   Sally M. Elghamrawy

Abstract

The growth of data on the web poses major challenges. Storing large volumes of data and querying multiple data sources have become essential capabilities of big data systems. Many platforms are used to handle the NoSQL database model, such as Spark, H2O, and Hadoop HDFS/MapReduce, which are suitable for controlling and managing large amounts of data. Application developers face difficult tasks when they interact with data stores that expose mixed data models through different APIs and query languages. This paper proposes a Complex SQL Query and NoSQL (CQNS) framework that acts as an interpreter: it sends complex queries received from any data store to the corresponding execution engine. The proposed framework supports application queries and database transformation at the same time, which in turn speeds up the process. Moreover, CQNS handles multiple NoSQL databases, such as MongoDB and Cassandra. The paper provides a Spark-based framework that can handle both SQL and NoSQL databases. It also examines the importance of MongoDB block sharding and composition, and considers two types of partitioning for the Cassandra database: vertex partitioning and edge partitioning. Datasets from four scenarios are used to evaluate CQNS when querying the various NoSQL databases, in terms of optimization performance and query execution time. The results show that, among the compared systems, CQNS achieves the best latency and productivity in less time.

Keywords: NoSQL, Query processing, Querying, Hadoop HDFS, Spark Mongo, Spark connector, parallel k-means, clustering, query optimization, Cassandra, partitioning
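
The interpreter role described in the abstract, in which a query is forwarded to the engine that matches its data store, can be illustrated with a minimal sketch. This is not the CQNS implementation: the names `Query`, `QueryRouter`, `fake_mongo_engine`, and `fake_cassandra_engine` are hypothetical, and the engines are stand-ins that only echo the query instead of calling a real MongoDB or Cassandra driver.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Query:
    store: str  # target data store, e.g. "mongodb" or "cassandra"
    text: str   # query text in that store's own query language


class QueryRouter:
    """Hypothetical dispatcher: forwards each query to the engine registered for its store."""

    def __init__(self) -> None:
        self._engines: Dict[str, Callable[[str], List[dict]]] = {}

    def register(self, store: str, engine: Callable[[str], List[dict]]) -> None:
        # Store names are matched case-insensitively.
        self._engines[store.lower()] = engine

    def execute(self, query: Query) -> List[dict]:
        engine = self._engines.get(query.store.lower())
        if engine is None:
            raise ValueError(f"no engine registered for store {query.store!r}")
        return engine(query.text)


# Stand-in engines (assumptions for illustration only); a real system
# would call the MongoDB or Cassandra driver at this point.
def fake_mongo_engine(text: str) -> List[dict]:
    return [{"engine": "mongodb", "query": text}]


def fake_cassandra_engine(text: str) -> List[dict]:
    return [{"engine": "cassandra", "query": text}]


router = QueryRouter()
router.register("mongodb", fake_mongo_engine)
router.register("cassandra", fake_cassandra_engine)

# Route a CQL-style query to the Cassandra stand-in.
result = router.execute(Query("Cassandra", "SELECT * FROM users"))
```

Keeping the store-to-engine mapping in a registry means a further data store could be supported by registering one more engine, without changing the routing logic.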




How to Cite
[1]
Khashan, E.A., El Desouky, A.I. and Elghamrawy, S.M. 2020. A Framework for Executing Complex Querying for Relational and NoSQL Databases (CQNS). European Journal of Electrical Engineering and Computer Science. 4, 5 (Sep. 2020). DOI:https://doi.org/10.24018/ejece.2020.4.5.195.