Free Essay

No Sql Databases

In:

Submitted By vivekadithya
Words 4246
Pages 17
INSY 5337 Data Warehousing – Term Paper
NoSQL Databases: An Introduction and Comparison between Dynamo,
MongoDB and Cassandra
Authored ByNitin Shewale Aditya Kashyap Akshay Vadnere Vivek Adithya Aditya Trilok

Abstract
Data volumes have been growing exponentially in recent years, this increase in data across all the business domains have played a significant part in the analysis and structuring of data. NoSQL databases are becoming popular as more organizations consider it as a feasible option because of its schema-less structure along with its capability of handling BIG Data. In this paper, we talk about various types of NoSQL databases based on implementation perspective like key store, columnar and document oriented. This research paper covers the consolidated applied interpretation of NoSQL system, depending on the various database features like security, concurrency control, partitioning, replication, Read/Write implementation. We also would draw out comparisons among the popular products and recommend a particular NoSQL solution on the above mentioned factors.

1. Introduction
Until recently, Relational database systems have been on the forefront of data storage and management operations. The advent of mobile applications that requires real time analysis like
GPS based services, banking and social media has led to huge unstructured data being produced every second. Traditional RDBMS systems have found it difficult to cater to these huge chunks of unstructured data, as RDBMS mainly stores structured data in tabular format. Also, the unstructured data being mapped to a relational database results in increase in complexity as it uses expensive infrastructure to model the same. Also, even if the data model fits into SQL, platter of features provided by SQL becomes an overhead. Relational schema becomes a burden on applications which are trying to store data in multiple forms like videos, blogs and images etc.
A new methodology for data management was introduced for the management of unstructured data known as NoSQL (Not Only Structured Query Language).
NoSQL covers a broader topic of data structuring, storage and aggregation via various implementation approaches. It can store unstructured data and provide real time analysis to back up the web service applications. It gives up on conventional benchmarking of database management principles like Atomicity, Consistency, Isolation and Durability, to attain flexible data handling. Also, it provides inbuilt data partitioning and replication. Essentially, data across the business domains is governed by company policies and processes for data control and quality.
NoSQL moves away from these restrictions to promote performance and scalability requirements of particular application and services [1][2][3][4][10].

2. NoSQL Characteristics
Analogy of ACID properties in NoSQL is BASE, which is derived from CAP Theorem. CAP Theorem assures following database management standards –


Consistency – The given data should be available at all parts of the system at the same time. 

Availability – The data should be available any time and should provide a response every time.



Partition Tolerance – Total failure of the system should not be driven by failure of one section or partition of the system [7][8][9].

NoSQL database systems, like MongoDB and Cassandra, have strayed away from consistency to attain greater availability and efficient partitioning. This gave rise to systems driven on BASE principles. 

Basically Available – Data is distributed across various systems, hence the data is always available in one of the system, even if one of the systems fail.



Soft State – Since the data is distributed, there is no assurance of consistency.



Eventually Consistent – The data would be consistent eventually, even if it’s not at a given point in time [5][6][10].

3. Features of NoSQL


Flexible Data Models – NoSQL allows horizontal data partitioning across different distributed systems or processors. However, relational model has a fixed schema in

contrast to NoSQL. Applications based on NoSQL have data models explicitly designed and augmented for them.


Partial Record Updates – Data models that use NoSQL emphasize on column based processing that enable data aggregation on more than one attributes and entities.



Optimized MapReduce Processing – MapReduce, a native functionality for data movement and mapping is a part of NoSQL.



Horizontal Scalability – It allows on-the-fly addition of the processors with their own resources. Each node is fed with a subset of data to process, thus increasing the efficiency of the application. Horizontal scalability is more achievable in NoSQL data model as compared to RDBMS [1].

4. Types of NoSQL Databases



Key Value -> Key value data stores references the data using a unique key. The unique key acts as a link to the data that is randomly and independently stored on the disk.
Addition of new data values can be done without inflicting with existing data .Thus the key value stores are entirely schema less, the only structure that could possibly derived from the key stores is the combination of key value pairs. In this paper we discuss our findings on DynamoDB by Amazon.



Document -> Document data stores references a collection of uniquely identified keyvalue pairs known as Documents. Each document is recognized by its own unique ID in

the document collection. Document stores enables new documents to be stored with different kind of attributes. In this paper we discuss our findings on MongoDB


Column -> These are column centric data stores, where the indexing is done on every column. It provides efficient and high speed read-write operations. Any modification or addition of new data is stored using a timestamped version. We have introduced
Cassandra as an example for Column data store [1][10][12].

5. Comparative study between DynamoDB, Cassandra and MongoDB
5. A. DynamoDB
Dynamo was designed to provide a storage system within Amazon's platform that would be stubborn during unforeseen circumstances.
5. A.1 Key Features of Dynamo:


Key-Value Data Model - Data are represented as objects, and objects are determined based on unique keys. The operations supported on the data are get/put associated with the specified unique key.



Eventual Consistency - The primary objective of Dynamo is to be stubborn against unforeseen circumstances. However, it is a challenge to obtain such a consistency at the initial phase. Hence, consistency increases eventually while all the replicas are updated in a timely manner.



Symmetry and Decentralization - Every node had as much responsibility as the peers in
Dynamo. Every node is equally responsible for its peers in Dynamo. Thus the probability of failure is very low and the amount of manual intervention required would be very less
[24][26][27][28][29].

5. A.2 Operations:
Dynamo performs the following operations:


Get - To return the object associated with the key.



Put - To associate the object with the specified key.

5. A.3 Security:
Dynamo does not implement an efficient security mechanism, making it inefficient in handling scenarios that require authorization [24][26][27][28][29].

5. A.4 Partitioning:
Dynamo is a highly scalable system that can adapt to varying amounts of data by adding and removing nodes in a flexible manner. To implement partitioning, Dynamo uses a technique called consistent hashing, every node is allocated to one or more points on a fixed ring. Each data item is identified by a unique key. The data item is allocated to that specific node by hashing the key of the same. The output thus obtained is a point on the ring. Ring is rotated clockwise to identify the initial node. This derives an effective methodology of partitioning, as any deletion of node would have an impact only on their immediate members in the ring.

5. A.5 Replication:
The data is replicated on multiple hosts, thus resulting in supreme quality, reliability and durability. To implement replication, Dynamo replicates each data object at N nodes, where the value of N is set by the user. Coordinator node is allocated with a key, K, data associated with the
K node is stored locally. Also, the node replicates N-1 different nodes forming a ring
[24][26][27][28][29].
5. A.6 Storage:
Every specific node in Dynamo has its own persistence engine, this engine is used for storage as binary objects. Every instance uses its own unique persistence engine for storage. Few types of persistence engines used by instances are MySQL and Berkeley Database (BDB). The persistence engine makes use of pluggable components. The advantage of these pluggable engines is that users can choose the engine based on their requirements. For instance, BDB handles relatively small objects whereas MySQL can handle objects of large sizes [24][26][27][28][29].
5. A.7 Read/Write Implementation:
Dynamo implements a protocol that has two parameters R/W which represent the minimum number of nodes that must participate in a read/write operation. When a write operation is to be performed by the coordinator, it writes the data locally and then sends the write request to the other N-1 replica nodes. If a response is obtained for at least W-1 nodes, then the operation is said to be a success. Then, the coordinator informs the client.

When the coordinator is requested to perform a read operation, the coordinator sends a read request to the N-1 nodes. When there is a response from at least R-1 nodes, the result is returned to the client. If the objects received are different and if they are received from different nodes, a list of objects is sent by the coordinator to the client rather than a single object
[24][26][27][28][29].
5. A.8 Concurrency Control:
Shared objects are allowed access concurrently among multiple clients. Before all replica nodes are updated, write operations are returned. Hence, different versions of the same object may be returned. Such inconsistencies are handled effectively by Dynamo [24][26][27][28][29].

B. MongoDB
MongoDB is a document key store based product developed in C++. The indexing in case of
MongoDB is done using document key structure. It is a schema-less, performance and query optimization based product [12] [10].
5. B.1 Features of MongoDB


Flexibility during initial phases of development and design.



Horizontal scalability is infused an inbuilt feature in Mongo



User friendly tools to transfer data between different databases



Inter compatibility between implementation in various programming language [30].

5. B.2 Operations:
MongoDB allows these operations:


Insert – Adds new documents to a collection.



Find – Retrieves documents from a collection.



Update– Updates documents of a collection.



Remove – Removes a document from a collection [23][10].

5. B.3 Security
Since Mongo data files are unencrypted, they are prone to attacks. To lessen this, the application must actively encrypt every sensitive information before writing into the DB and also prevent unauthorized access. As Mongo uses java script for internal language, it is also prone to potential script injection attacks. Authentication is not provided in sharded clusters of Mongo DB [13].
5. B.4 Partitioning/ Sharding
Sharding enables segregating of data in numerous machines. MongoDB allows automated partitioning as a built in feature. This feature allows horizontal scaling across many processors
(nodes). Sharding combined with replication leads to availability of a highly mountable cluster.
For resource hungry applications, MongoDB creates cluster of shards, and balances the nodes, without impacting the original node [1] [10] [11] [13].
5. B.5 Storage
Data format used in storing data in Binary JSON (BSON) with a maximum size of 16MB. Data allocation is limited to 2GB per node in a 32 bit system. Data is mapped in-memory to increase

performance. Data is transferred to the disc after every minute by default, which is customizable
.Creation of new files is followed up by immediate flushing of data to disc, thus freeing up the memory [10][11] [14].

5. B.6 Replication
In MongoDB data replication is driven by Master-slave replication with various replica sets. Data is replicated in asynchronous form across servers. Read operations can be performed by multiple slave servers whereas write operation can be handled by only one server at a given point in time.
All the servers at a given point in time have a master server and a new master server is elected in case the previous one falters. Reading from multiple slave servers leads to eventual consistency, to achieve load balancing. The client has the ability to enforce the write operations the master server [10][11] [14] [13].
Replicas can be created in many ways in MongoDB catering to different needs of the application

Secondary Replicas - These replicas are not giving the opportunity to become a Master , they just store data



Hidden Replicas - These replicas are hidden from the application and cannot be elected as Master server. These replicas perform read only operations and are given voting rights to elect a new master server in case of failover.



Delayed - Delayed replicas are not synced with the master and will not have the updated data. 

Arbiters - These replicas are basically arbitrators and do not take part in any functionalities except communicating with other members and taking part in election [11] [14].

5. B.7 Read/Write Implementation
Indexing used in Mongo allows efficient read operations but effects negatively on the write/ insert operations. Mongo allows read operations on slave servers, write operations are controlled by master server. Data reading is performed by slave servers simultaneously in an asynchronous manner [14].
5. B.8 Concurrency Control:
Instant update on all the nodes is done on a MongoDB database system. Mongo DB does not support concurrency control. It exhibits eventual consistency. Data is sent out asynchronously to slave servers, thus it is not controlled.[11][12][13][14]

5. C. Cassandra
Cassandra was designed by Facebook to cater to humongous data needs of the organization.
Cassandra essentially vouches for two BASE features i.e. availability and scalability [21] It brings together the data structure of BigTable and high availability feature of Dynamo [25][11].

5. C.1 Features of Cassandra


Cassandra has multiple nodes in a cluster which are identical in terms of their software infrastructure. All the nodes are symmetric and does not need a master node. This feature allows linear scalability.



Hashing implemented for a new data value does not significantly impact the indexing maintained for other data values.



Interface provided by Cassandra is not easy to use for developers [13].

5. C.2 Operations/Read Write Implementation
1) Write -> Write function when executed by a client, it is captured by one of the nodes in the cluster randomly. This nodes then in turn writes the data to the cluster. The write action is then replicated on all the other nodes of the cluster via a Replication placement strategy. 2) Append -> after the write action being passed on to the individual nodes, change in data is proceeded to commit.
3) Update -> Update function modifies the main memory structure table with the update.
4) Read -> Client makes a read request to the random node in Cassandra, this node then identifies the node in the cluster holding the required data and then transfers the read request to that particular node.
[11][13][14]

5. C.3 Storage
Column based storage is the mainstay of storage system in Cassandra. Cassandra predominantly has one table as its primary operational unit. It also has a multidimensional map which is distributed and linked using keys. [19][20]Column families are defined in the initial phase of launching Cassandra, column families can be infinite. Specifications of at least some of the column families is mandatory. These families are further subdivided into columns and super columns. These can be added on runtime to the column families. Indexing of the columns can be done using the name which is being assigned to the column, they store numerous data values in each row. Similarly super columns are identified by their name and consist of multiple columns which are linked to super columns randomly [11][13]14].
5. C.4 Partitioning
Cassandra runs on nodes in a cluster which are symmetrical, hence the same data is distributed on all the nodes. Partitioning is done using two techniques i.e. Order-preserving partitioning and
Random partitioning. Order-preserving partitioning enables efficient execution of range queries but might cause issues in load-balancing. The nodes and their keys are evenly distributed in the cluster in both these techniques [11][13][14].
5. C.5 Replication
Replication of data is done on all the nodes of a cluster, data set is assigned to a particular node in the cluster. Data items are allocated to a spot in the node depending on the key of the data item, consistent hashing is used to identify the key of the data item. (24) Each data item has a

node coordinator which coordinates the replication of that data item to other nodes. Also client can choose the no of replicas that a particular data item can maintain [8] [11].
5. C.6 Concurrency Control
Cassandra enables Multi version concurrency control [8].
5. C.7 Security
Data files and the interactions between the client-database are unencrypted, as a result of which any user with access to file systems can extract the information he/she desires. Also, Intra cluster communication can be done freely whereas Inter cluster communication comes with a facility of authentication. Security in Cassandra is loosely implemented, IP addresses of the nodes of the cluster is the only info needed to sniff into the system [13][11].

6. Conclusion and Recommendation
We have compared three main products i.e. Dynamo, MongoDB and Cassandra on the basis of major features that drive the selection of a NoSQL product for any organization. MongoDB and
Cassandra are supersets of Dynamo, as they also are essentially implemented on the key-value pair indexing. Dynamo fails to maintain relatively similar attributes together, as can be done in
MongoDB through document linking. Also, horizontal scalability is better achieved in MongoDB and Cassandra than Dynamo.
Eventually, we have figured out that MongoDB and Cassandra are better products in terms of partitioning, replication and concurrency control than Dynamo.



When it comes to update operations, Cassandra is much faster than MongoDB and is independent of the size of the data.



Read operations in Cassandra are relatively fast than MongoDB for medium sized data sets, speed of read operations decline with increase in number of records.



Complex queries consisting of read and update operations simultaneously are better performed in Cassandra than MongoDB.



Symmetric node structure in cluster formation in Cassandra serves better concurrency control than Master slave structure in MongoDB.



Security in NoSQL systems is loosely implemented, comparatively Cassandra provides better authentication and authorization mechanisms than what we have in MongoDB
[8][11][14].

We would recommend Cassandra as an overall better product when compared on the basis of replication, concurrency control, Partitioning and Read/Write Implementation. Cassandra is tried and tested, it is being used by more than 1500 companies [25].

References
[1] NoSQL Systems for Big Data Management- 2014 IEEE 10th World Congress on Services- Venkat N
Gudivada Weisburg Division of Computer Science Marshall University Huntington, WV, USA gudivada@marshall.edu Dhana Rao Biological Sciences Department Marshall University Huntington,
WV, USA raod@marshall.edu Vijay V. Raghavan Center for Advanced Computer Studies University of
Louisiana at Lafayette Lafayette, LA, USA vijay@cacs.louisiana.edu

[2] R. Cattell, “Scalable sql and nosql data stores,” SIGMOD Rec., vol. 39, no. 4, pp. 12–27, May 2011.
[3] V. Benzaken, G. Castagna, K. Nguyen, and J. Siméon, “Static and dynamic semantics of NoSQL languages,” SIGPLAN Not., vol. 48, no. 1, pp. 101–114, Jan. 2013.
[4] F. Cruz, F. Maia, M. Matos, R. Oliveira, J. a. Paulo, J. Pereira, and R. Vilaça, “MeT: Workload aware elasticity for NoSQL, booktitle = Proceedings of the 8th ACM European Conference on Computer
Systems, series = EuroSys ’13, year = 2013, isbn = 978-1-4503-1994-2, location = Prague, Czech Republic, pages = 183–196, numpages = 14, publisher = ACM, address = New York, NY, USA.”
[5] REDUCE, YOU SAY: What NoSQL can do for Data Aggregation and BI in Large Repositories - 2011 22nd
International Workshop on Database and Expert Systems Applications - Laurent Bonnet1,2 , Anne
Laurent1 , Michel Sala1 1LIRMM Universite Montpellier 2 – CNRS ´ 161 rue Ada, 34095 Montpellier –
France name.surname@lirmm.fr Ben´ edicte Laurent ´ 2 2Namae Concept Cap Omega 34000
Montpellier – France b.laurent@namaconcept.com Nicolas Sicard3 3LRIE – EFREI 30-32 av. de la republique ´ 94 800 Villejuif – France nicolas.sicard@efrei.fr
[6] P. A. Bernstein and N. Goodman. Multiversion concurrency control – theory and algorithms. ACM
Trans. Database Syst., 8:465–483, December 1983.
[7] NoSQL Database: New Era of Databases for Big data Analytics - Classification, Characteristics and
Comparison - International Journal of Database Theory and Application Vol. 6, No. 4. 2013 - A B M
Moniruzzaman and Syed Akhter Hossain Department of Computer Science and Engineering Daffodil
International University abm.mzkhan@gmail.com, aktarhossain@daffodilvarsity.edu.bd
[8] NoSQL Databases: MongoDB vs Cassandra - Veronika Abramova Polytechnic Institute of Coimbra
ISEC - Coimbra Institute of Engineering Rua Pedro Nunes, 3030-199 Coimbra, Portugal Tel. ++351 239
790 200 a21190319@alunos.isec.pt Jorge Bernardino Polytechnic Institute of Coimbra ISEC - Coimbra

Institute of Engineering Rua Pedro Nunes, 3030-199 Coimbra, Portugal Tel. ++351 239 790 200 jorge@isec.pt [9] Jing Han; Haihong, E.; Guan Le; Jian Du, "Survey on NoSQL database," Pervasive Computing and
Applications (ICPCA), 2011 6th International Conference on , vol., no., pp.363,366, 26-28 Oct. 2011. doi:10.1109/ICPCA.2011.6106531. [10] NoSQL Evaluation A Use Case Oriented Survey - 2011 International Conference on Cloud and Service
Computing - Robin Hecht Chair ofApplied Computer Science IV University of Bayreuth Bayreuth,
Germany robin.hecht@uni -bayreuth.de, Stefan Jablonski Chair ofApplied Computer Science IV
University ofBayreuth Bayreuth, Germany stefan.jablonski@uni-bayreuth.de
[11] A Comparative Analysis of Different NoSQL Databases on Data Model, Query Model and Replication
Model-> Clarence J. M. Tauro1,∗, Baswanth Rao Patil2 and K. R. Prashanth3 - 1Christ University, Hosur
Road, Bangalore, India. 2Department of Computer Science, Christ University, Hosur Road, Bangalore,
India. 3Department of Computer Science, Christ University, Hosur Road, Bangalore, India. e-mail: clarence.tauro@res.christuniversity.in; baswanth.rao@cs.christuniversity.in; prashanth.r@cs.christuniversity.in [12] 2012 Third International Conference on Emerging Intelligent Data and Web Technologies MongoDB vs Oracle - database comparison - Alexandru Boicea, Florin Radulescu, Laura Ioana Agapin
Faculty of Automatic Control and Computer Science , Politehnical University of Bucharest,Bucharest,
Romania . alexandru.boicea@cs.pub.ro, florin.radulescu@cs.pub.ro, lauraioana.agapin@gmail.com
[13] Security Issues in NoSQL Databases - 2011 International Joint Conference of IEEE TrustCom-11/IEEE
ICESS-11/FCST-11 - Lior Okman Deutsche Telekom Laboratories at Ben-Gurion University, Beer-Sheva,
Israel, Nurit Gal-Oz, Yaron Gonen, Ehud Gudes Deutsche Telekom Laboratories at Ben-Gurion University, and Dept of Computer Science, Ben-Gurion University, Beer-Sheva, Israel, Jenny Abramov Deutsche

Telekom Laboratories at Ben-Gurion University and Dept of Information Systems Eng. Ben-Gurion
University, Beer-Sheva, Israel.
[14] NoSQL Databases: MongoDB vs Cassandra - Veronika Abramova Polytechnic Institute of Coimbra
ISEC - Coimbra Institute of Engineering Rua Pedro Nunes, 3030-199 Coimbra, Portugal Tel. ++351 239
790 200 a21190319@alunos.isec.pt, Jorge Bernardino Polytechnic Institute of Coimbra ISEC - Coimbra
Institute of Engineering Rua Pedro Nunes, 3030-199 Coimbra, Portugal Tel. ++351 239 790 200 jorge@isec.pt [15] E. Brewer. (2000, Jun.) Towards robust distributed systems. [Online]. Available: http://www.cs.berkeley.edu/ brewer/cs262b-2004/PODCkeynote.pdf
[16] S. Gilbert and N. Lynch, “Brewer’s conjecture and the feasibility of consistent, available, partitiontolerant web services,” SIGACT News, vol. 33, pp. 51–59, June 2002. [Online]. Available: http://doi.acm.org/10.1145/564585.56460 [17] Jing Han; Haihong, E.; Guan Le; Jian Du, "Survey on NoSQL database," Pervasive Computing and
Applications (ICPCA), 2011 6th International Conference on , vol., no., pp.363,366, 26-28 Oct. 2011. doi:10.1109/ICPCA.2011.6106531. [18] Tudorica, B.G.; Bucur, C., "A comparison between several NoSQL databases with comments and notes," Roedunet International Conference (RoEduNet), 2011 10th , vol., no., pp.1,5, 23-25 June 2011. doi:10.1109/RoEduNet.2011.5993686. [19] Lakshman, Avinash, Malik and Prashant, Cassandra – A Decentralized Structured Storage
System. In: SIGOPS Operating Systems Review, vol. 44, pp. 35–40, April (2010).
[20] Lakshman and Avinash, Cassandra – A structured storage system on a P2P Network. August
(2008).

[21] The apache software foundation, The Apache Cassandra Project (2011). http://cassandra.apache.org/, last accessed on January (2011).
[22] David Karger, Eric Lehman, Tom Leighton, Rina Panigrahy, Matthew Levine and Daniel
Lewin, Consistent hashing and random trees: distributed caching protocols for relieving hotspots on the WorldWideWeb .In Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, STOC’97, pp. 654–663, New York, NY, USA (1997) ACM.
[23] https://www.mongodb.org/ - https://docs.mongodb.org/manual/
[24] https://aws.amazon.com/dynamodb/
[25] http://cassandra.apache.org/
[26] Dynamo and BigTable – Review and Comparison - 2041 IEEE 28-th Convention of Electrical and
Electronics Engineers in Israel - Grisha Weintraub Dept. of Mathematics and Computer Science The
Open University Raanana, Israel
[27] Neal Leavitt: Will NoSQL Databases Live Up to Their Promise?. IEEE Computer (COMPUTER)
43(2):12-14 (2010)
[28] G. DeCandia et a l.: Dynamo: amazon's highly available keyvalue store. SOSP 2007:205-220
[29] Rick Cattell: Scalable SQL and NoSQL data stores. SIGMOD Record (SIGMOD) 39(4):12-27 (2010)
[30] 2012 Third International Conference on Emerging Intelligent Data and Web Technologies MongoDB vs Oracle - database comparison- Alexandru Boicea, Florin Radulescu, Laura Ioana Agapin
Faculty of Automatic Control and Computer Science Politehnica University of Bucharest Bucharest,
Romania alexandru.boicea@cs.pub.ro, florin.radulescu@cs.pub.ro, lauraioana.agapin@gmail.com

Similar Documents

Premium Essay

Database - Sql

...day in class prior to the final exam. Questions 1. What is a database? Describe the 3 main types discussed in the book. * Is a collection of related data. * Doesn’t have to be electronic; example : card catalog’s in libraries ; spiral notebook ; phone or address book * Usually mean electronic database that run on computers 2. Why is the relational database model most commonly used today? What impact does it have on data integrity? * The idea came from Edgar F Codd in 1970 he worked for IBM at the time * Advantage is the tables define the relationships among themselves by mean of repeating an attribute or column from one table in another table “called Keys” * One of the problems was data redundancy; means to store the same data in more than place in the database * Codd’s theoretical design minimized redundancy * The impact is; the complexity of the design, its easy to create a hard to use database, or hard to fine the data you entered into the database. So a well-designed database = data integrity and flexibility. 3. Distinguish between data integrity and redundancy. * Data integrity = to the accuracy and quality of the data * Redundancy = refers to data that is repeated in multiple places in a database. 4. What is an entity? * An object of concern to a database, such as a customer or sale. Used in the logical design phase of a database. The main key that starts the gathering of data into one place. ...

Words: 2656 - Pages: 11

Premium Essay

Sql Database

...dwqwdlqkdmqwkldnqwkldnwkdnwqdldkwd Solutions To Workshop Exercises Chapter 1: SQL and Data 4 Chapter 2: SQL: The Basics 13 Chapter 3: The WHERE and ORDER BY Clauses 20 Chapter 4: Character, Number, and Miscellaneous Functions 27 Chapter 5: Date and Conversion Functions 42 Chapter 6: Aggregate Functions, GROUP BY and HAVING 57 Chapter 7: Equijoins 73 Chapter 8: Subqueries 108 Chapter 9: Set Operators 126 Chapter 10: Complex Joins 137 Chapter 11: Insert, Update, and Delete 164 Chapter 12: Create, Alter, and Drop Tables 178 Chapter 13: Indexes, Sequences, and Views 188 Chapter 14: The Data Dictionary, Scripting, and Reporting 197 Chapter 15: Security 213 Chapter 16: Regular Expressions and Hierarchical Queries 222 Chapter 17: Exploring Data Warehousing Features 235 Chapter 18: SQL Optimization 248 Chapter 1: SQL and Data In this chapter, you learned about data, how data is organized in tables, and how the relationships among the tables are depicted in a schema diagram. Based on your newly acquired knowledge, design a schema diagram based on the fictional ACME Construction Company. Draw on your own work experience to design the following components. 1. Draw boxes for these three tables: EMPLOYEE, POSITION, and DEPARTMENT. Solution: See the solution for Exercise 3. 2. Create at least three columns for each of the tables and designate a primary key for each table. Solution: See the solution for Exercise 3. 3. Create relationships among the...

Words: 6586 - Pages: 27

Premium Essay

Sql Database

...1. DBMS: DataBase Management System. 2. A shared integrated computer structure that houses a collection of related data. DBMS is the software product and Database is to create and maintain the software. 3. a) Handle all physical structure b) - Often used by multiple people (views) - Concurrency control (Accessing/ updating simultaneously) 4. A DBMS component that stores metadata- data about data. Thus the data dictionary contains the data definition as well as their characteristics and relationships. A data dictionary may also include data that are external to the DBMS. 5. - redundancy -promotes consistency -handles concurrency - data integrity 6. – Cost: product itself, more expensive hardware - Complexity: Only accessible via DBMS 7. a)Table: A matrix composed of intersecting rows and columns that represents an entity set in the relational model b) File: A collection of related records c) Record: A logically connected set of one or more fields that describes a person place or thing d) Row: where we record all the data e) Attribute: where we put the names and characteristics f) Field: A character of group of characters that has a specific meaning g) Columns: Where we put of the attributes 8. it will direct into lack of design and data modeling skills, also it will be a lot of data redundancy which is not good for the database. 9. a) Seven records b) Five fields c) -addresses are only from FL and TN ...

Words: 387 - Pages: 2

Free Essay

About Databases Sql

...Année universitaire Département Matière Enseignant Intitulé TD/TP : Durée 2012-2013 Informatique Bases de données avancées Haytham Elghazel TP JDBC/PLSQL 4h Année 4A   Ce   TP   est   à   réaliser   seul   ou   en   binôme   (trinômes   interdits).   Il   est   à   rendre   sur   spiral   (http://spiralconnect.univ-­‐lyon1.fr/spiral/spiral.html#/activities/goto_folder/1940969)   pour   le   01/12/2012,   23h30.   Il   faut   rendre   le   programme   (projet   maven   +   script   SQL   commenté   dans   un   fichier  zip).  Ne  pas  oublier  de  mentionner  les  deux  étudiants  du  binôme  dans  le  nom  du  fichier  zip  qui   sera   sous   la   forme  :   TP2_BDAV_Nom1_Prenom1_Nom2_Prenom2.zip.   Nom1_Prenom1   est   le   nom   et   le   prénom   du   premier   membre   du   binôme,   Nom2_Prenom2   est   le   nom   et   le   prénom   du   second   membre  du  binôme.   Le  non-­‐respect  de  ces  consignes  pourra  être  sanctionné  dans  la  note  de  ce  TP.     On  utilisera  le  schéma  relationnel  du  forum  de  discussion  mis  en  place  au  TP  précédent  auquel  il  faut   rajouter  la  ligne  suivante...

Words: 1924 - Pages: 8

Premium Essay

Basic Database Notes (Sql)

...Create Table CREATE TABLE PRESENTERS (PRESENTERID CHAR (6) PRIMARY KEY NOT NULL, PNRLNAME VARCHAR (20) NOT NULL, PNRFNAME VARCHAR (20) NOT NULL, GENDER CHAR (2) DEFAULT ‘M’, AGE SMALLINT NOT NULL CHECK (AGE>=18), YEARS SMALLINT NOT NULL, SALARY_YEARLY DECIMAL (10, 2) NOT NULL); CREATE TABLE EPISODES (EPISODENO INT IDENTITY (1,1) PRIMARY KEY NOT NULL, EPISODENAME VARCHAR (50) NOT NULL, DATEAIRED DATE NOT NULL, GUEST VARCHAR (50), COUNTRYAIRED VARCHAR (10) NOT NULL, PRESENTERID CHAR (6) FOREIGN KEY REFERENCES PRESENTERS (PRESENTERID), CARNO CHAR (7) FOREIGN KEY REFERENCES CARS (CARNO)); Insert Into Table INSERT INTO PRESENTERS (PRESENTERID, PNRLNAME, PNRFNAME, AGE, YEARS, SALARY_YEARLY) VALUES ('EMP_01','NEEDELL','TIFF', 61, 10, 374500.70), ('EMP_02','BUTLER-HENDERSON','VICKI', 41, 10, 24262.55), ('EMP_03','PLATO','JASON', 45, 8, 29100.98); Alter Tables Add Column ALTER TABLE EMPLOYEES ADD LAST_NAME VARCHAR (50) Datatypes ALTER TABLE CAKES ALTER COLUMN CAKENO CHAR (8) NOT NULL Constraints ALTER TABLE EMPLOYEES ADD CONSTRAINT PK1 PRIMARY KEY (EMPID) Foreign Keys ALTER TABLE EPISODES ADD FOREIGN KEY (CAKENO) REFERENCES CAKES (CAKENO) Change Column SP_RENAME 'TABLE_NAME'.'OLD_COLUMN_NAME', 'NEW_COLUMN_NAME', 'COLUMN'; Drop Column ALTER TABLE EMPLOYEES DROP COLUMN LAST_NAME; Delete Row DELETE FROM CARS WHERE CARMAKE='SUBARU WRX' Drop Table DROP TABLE EMPLOYEES Update Tables UPDATE PRESENTERS SET SALARY_YEARLY...

Words: 1804 - Pages: 8

Premium Essay

Distribution Channel

...Stored procedures Stored procedures can help improve application performance and reduce database access traffic. All database access must go across the network, which, in some cases, can result in poor performance. For each SQL statement, a database manager application must initiate a separate communication with DB2. To improve application performance, you can create stored procedures that run on your database server. A client application can then simply call the stored procedures to obtain results of the SQL statements that are contained in the procedure. Because the stored procedure runs the SQL statement on the server for you, database performance is improved. In addition, stored procedures can help to centralize business logic. If you make changes to a stored procedure, the changes are immediately available to all client applications that use it. Stored procedures are programs that have the following characteristics: • Contain procedural constructs with SQL statements • Are stored in databases and run on DB2 servers • Can be called by name by an application that is using SQL • Allow an application program to run in two parts: the application on the client and the stored procedure on the server The following figures show how two client applications access a database located on a database server. A client application that does not use stored procedures to access a database can require more network traffic. A client application that takes advantage of a stored...

Words: 2097 - Pages: 9

Premium Essay

Sql Injection.

...due to a SQL injection flaw in a web application that communicate with a database. Over ten years have passed since a famous hacker coined the term “SQL injection” and it is still considered one of the major application threats. A lot has been said on this vulnerability, but not all of the aspects and implications have been uncovered, yet. This paper aim is to collate some of the existing knowledge, introduce new techniques and demonstrate how to get complete control over the database management system's underlying operating system, file system and internal network through SQL injection vulnerability in over-looked and theoretically not exploitable scenarios. This paper also discuss about the prevention from the SQL Injection, not only in ORACLE but also in PHP, C#, JAVA and other languages. INDEX ABSTRACT………………………………………………………………………………….....02 INTRODUCTION……………….…………………………….…….………………………….04 BLIND SQL INJECTION…………………………………….………………………………..05 SQL INJECTION OVERVIEW…………………………….………………………………....06 CATEGORIES OF SQL INJECTION ATTACKS…………………………………………..07 WHAT’S VULNERABLE…………………………………………………………..…………08 WHAT’S NOT VULNERABLE…………………………………………………….………….08 SQL INJECTION METHODS……………………………………….……………….……….09 SQL MANIPULATION………………………………………………………..……………….09 CODE INJECTION……………………………………………………….……………………10 FUNCTION CALL INJECTION……………………………………………………………….11 BUFFER OVERFLOWS………………………………………………………………………13 SQL INJECTION TESTING METHODOLOGY………………………………………….....14 PREVENTING SQL INJECTION...

Words: 3449 - Pages: 14

Premium Essay

Pt2520 Unit 1 Exploring Programming Languages

...Languages “SQL is a tool for organizing, managing, and retrieving data stored by a computer database.”¹ This tool has been around for many decades and has evolved over time much like any of today’s technology has. Below is a chart found in Chapter 3 of SQL: The Complete Reference, Third Edition showing the milestones in the development of SQL: Milestones in SQL Development | Year | Event | 1970 | Codd defines relational database model | 1974 | IBM begins System/R project | 1974 | First article describing the SEQUEL language is published | 1978 | System/R customer tests are conducted | 1979 | Oracle introduces first commercial RDBMS | 1981 | Relational Technology introduces Ingres | 1981 | IBM announces SQL/DS | 1982 | ANSI forms SQL standards committee | 1983 | IBM announces DB2 | 1986 | ANSI SQL1 standard is ratified | 1986 | Sybase introduces RDBMS for transaction processing | 1987 | ISO SQL1 standard is ratified | 1988 | Ashton-Tate and Microsoft announce SQL Server for OS/2 | 1989 | First TPC benchmark (TPC-A) is published | 1990 | TPC-B benchmark is published | 1991 | SQL Access Group database access specification is published | 1992 | Microsoft publishes ODBC specification | 1992 | ANSI SQL2 standard (SQL-92) is ratified | 1992 | TPC-C (OLTP) benchmark is published | 1993 | Specialized SQL data warehousing systems are shipped for the first time | 1993 | ODBC products are shipped for the first time | 1994 | Parallel database server...

Words: 1281 - Pages: 6

Premium Essay

Pos410 R12 Course Syllabus

... | | |SQL for Business | Copyright © 2010, 2009, 2008, 2007, 2006 by University of Phoenix. All rights reserved. Course Description This course covers Structured Query Language (SQL) that provides a unified language that lets you query, manipulate, or control data in a business applications environment. Policies Faculty and students/learners will be held responsible for understanding and adhering to all policies contained within the following two documents: University policies: You must be logged into the student website to view this document. Instructor policies: This document is posted in the Course Materials forum. University policies are subject to change. Be sure to read the policies at the beginning of each class. Policies may be slightly different depending on the modality in which you attend class. If you have recently changed modalities, read the policies governing your current class modality. Course Materials Vieira, R. (2009). Beginning Microsoft SQL Server 2008 Programming. Indianapolis, IN: Wiley Publishing, Inc. Software Microsoft SQL Severer 2008 (Virtual Desktop) Supplemental Resource Rob, P., & Coronel, C. (2009). Database systems: Design, implementation, and management (8th ed.). Boston: Course Technology. Article References Barnett,...

Words: 2401 - Pages: 10

Free Essay

Distributed Systems

...Server Training (16 Courses) Training on how to build and manage SQL Server databases. Our SQL Server Training Courses provide the skills needed to build a solid foundation for SQL Server development. Introduction An overview of DBMS technology * How data is accessed, organized and stored * The database development process * Query and application development tools * CASE tools for database analysis and design * Tables, attributes and relationships * Primary and foreign keys * Relational integrity constraints * Manipulating data: selection, projection, join, union, intersection, difference * An integrated, active data dictionary * The query optimizer * Developing the logical data model * Mapping the data model to the relational model * Specifying integrity constraints * Defining the data in the data dictionary * Capturing entities, attributes and identifiers * Describing relationships: one-to-one, one-to-many, many-to-many * Optional and mandatory relationships * Resolving many-to-many relationships for implementation * Generating the SQL to build the database * Reverse engineering to capture the design of an existing database * SQL Programming Language Introduction 1 Days * Write SQL code based on ANSI/ISO standards to build Microsoft SQL Server or Oracle database structures * Update database content with SQL and transaction handling * Retrieve data with filter conditions and from...

Words: 1010 - Pages: 5

Free Essay

Sql Server Security Best Practise

...SQL Server 2012 Security Best Practices - Operational and Administrative Tasks SQL Server White Paper Author: Bob Beauchemin, SQLskills Technical Reviewers: Darmadi Komo, Jack Richins, Devendra Tiwari Published: January 2012 Applies to: SQL Server 2012 and SQL Server 2014 Summary: Security is a crucial part of any mission-critical application. This paper describes best practices for setting up and maintaining security in SQL Server 2012. Copyright The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication. This white paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in, or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation. Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual...

Words: 15647 - Pages: 63

Free Essay

Database Design

...Database Design DBM502 – Database Management March 24th, 2014 University of Phoenix Professor: Sam Haidar EzStream This paper will provide an overview of the database to be utilized for the startup company EzStream Inc. The core business of EzStream is to provide a complete solution to stream media content via online or WIFI. Customers will have the choice to rent, buy, or pay a monthly subscription to watch media content. Several components will break down the development of the database and provide details to the database infrastructure. Conceptual Design The conceptual design of EzStream’s DB will consist of Movies, Suppliers, and Customers. Customers will either rent or purchase movies, and have the option of paying a monthly subscription rate to watch movies via digital streaming. Data Analysis and Requirements * Tasks during Research and Analysis: * Identify essential "real world" information (e.g. interviews) * Remove redundant, unimportant details * Clarify unclear natural language statements * Fill remaining gaps in discussions * Distinguish data and operations Requirement Analysis First step: filter essential information vs unimportant details * Essentials * There are customers, suppliers, and media content * Identify age of audience for rentals * Customers have a customer identification number * Four weeks maximal rental time. * Unimportant details * "...Rentals since a few...

Words: 1876 - Pages: 8

Premium Essay

Nt1330 Unit 3 Assignment 1

...Verify that the database backups are stored in a secured location. Otherwise attackers can able to get and access the database backups. Check that the SQL Server databases have limited access. So that we can reduce the number of persons who could access the database. Check the Access database engine files format. It should be in some format. Otherwise malicious users can access the SQL Server files then they may damage it. Check that the file system permissions of audit files. It should be on limited format. Otherwise it could be damaged by attackers. Verify that the database backup’s private and public encryption keys stored in a secured location for safety purpose. Verify that the SQL Server computer protected by a firewall. Check that the server and client operating...

Words: 509 - Pages: 3

Premium Essay

A Survey of Sql Injection Defense Mechanisms

...A Survey of SQL Injection Defense Mechanisms Kasra Amirtahmasebi, Seyed Reza Jalalinia and Saghar Khadem Chalmers University of Technology, Sweden akasra, seyedj, saghar{@student.chalmers.se} Abstract SQL Injection Attack (SQLIA) is a prevalent method which makes it possible for the attackers to gain direct access to the database and culminates in extracting sensitive information from the firm’s database. In this survey, we have presented and analyzed six different SQL Injection prevention techniques which can be used for securing the data storage over the Internet. The survey starts by presenting Variable Normalization and will continue with AMNESIA, Prepared statements, SQL DOM, SQLrand and SQLIA prevention in stored procedures respectively. that determining whether a SQL statement is allowable or not is done by checking the existence of normalized statement in the ready-sorted allowable list. 2.1. Background Many web pages ask users to input some data and make a SQL queries to the database based on the information received from the user i.e. username and passwords. By sending crafted input a malicious user can change the SQL statement structure and execute arbitrary SQL commands on the vulnerable system. Consider the following username and password example, in order to login to the web site, the user inputs his username and password, by clicking on the submit button the following SQL query is generated: SELECT * FROM user_table WHERE user_id = ‘john’ and password...

Words: 5643 - Pages: 23

Premium Essay

Databases in Use

...Databases in use:- 1. Mysql What is MS SQL? It is a relational web hosting database that is used to store web site information like blog posts or user information. MS SQL is the most popular type of database on Windows servers. It is not free but it has many advanced features that make it suitable for businesses. What is MS SQL capable of? In basic terms, an MS SQL database is capable of storing any type of that you want. It will let you quickly store and retrieve information and multiple web site visitors can use it at one time. You will use SQL statements to accomplish all of this. In more technical terms, most versions of MS SQL have the following features: • Buffer management • Logging and Transaction • Concurrency and locking • Replication services • Analysis services • Notification services • Integration services • Full text search service • Stored procedures • Triggers • Views • Sub-SELECTs (i.e. nested SELECTs) What is MS SQL used for? MS SQL is the database of choice for web applications on a Windows platform (using .NET or ASP). These languages make is extremely easy to connect to a MS SQL database. It is also used for many popular content management systems and other scripts. Learn more about what is MS SQL on its official website. They have comprehensive learning material. You should also check out the SQL tutorial on W3Schools and some articles on the SQL Community Server blog. If you are looking for a web host that supports MS SQL, look at...

Words: 2647 - Pages: 11