Blog posts to help enterprises run applications in the cloud, with entries on cloud migrations as Fortune 1000 companies embark on moving to the cloud.
Friday, December 20, 2013
Oracle Database on SSD
There are new options for running Oracle Database on EC2 using SSD instance storage. The I2 instances feature the latest generation of Intel Ivy Bridge processors: each virtual CPU (vCPU) is a hardware hyperthread of an Intel Xeon E5-2670 v2 (Ivy Bridge) processor. I2 instances are available in four sizes, as listed in the table below.
Instance Type  vCPUs  ECU Rating  Memory (GiB)  Instance Storage SSD (GB)  Note
i2.xlarge      4      14          30.5          1 x 800                    800 GB of SSD
i2.2xlarge     8      27          61            2 x 800                    1.6 TB of SSD
i2.4xlarge     16     53          122           4 x 800                    3.2 TB of SSD (hi1.4xlarge, with 2 TB of SSD, was previously the largest SSD instance type)
i2.8xlarge     32     104         244           8 x 800                    6.4 TB of SSD
The i2.8xlarge instance size is capable of providing over 365,000 4 kilobyte (KB) random read IOPS and over 315,000 4 KB random write IOPS, making it a great fit for Oracle databases with high I/O requirements. This document has more on configuring Oracle Database on EC2 and using SSD-based instances: http://cloudconclave.blogspot.com/2013/11/aws-database-reference-implementation.html
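To gauge whether an existing Oracle database is I/O-bound enough to benefit from SSD instance storage, one rough starting point (a sketch that assumes SELECT access to the v$sysstat dynamic performance view) is to sample the cumulative physical I/O counters and difference two samples taken a known interval apart:
-- Sample Oracle's cumulative physical I/O statistics; run twice, a fixed
-- interval apart, and divide the differences by the interval to estimate IOPS.
select name, value
from   v$sysstat
where  name in ('physical read total IO requests',
                'physical write total IO requests',
                'physical reads',
                'physical writes');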
Wednesday, December 11, 2013
Redshift data loading basics
Here are some basics when getting started with loading data into Redshift:
1. COPY command to load data from S3 (issue once connected to Redshift using SQL Workbench or the psql tool):
copy orders from 's3://redshift-demo/tpc-h/100/orders/orders.tbl.' CREDENTIALS 'aws_access_key_id=<access key>;aws_secret_access_key=<secret key>' gzip delimiter '|' COMPUPDATE ON;
2. Compression encoding can be specified in the S3 COPY command or when the table is defined:
A. S3 COPY: copy orders from 's3://redshift-demo/tpc-h/100/orders/orders.tbl.' CREDENTIALS 'aws_access_key_id=<access key>;aws_secret_access_key=<secret key>' gzip delimiter '|' COMPUPDATE ON;
B. Table definition, for example: create table product (product_name char(20) encode bytedict);
3. Full list of encoding types: http://docs.aws.amazon.com/redshift/latest/dg/c_Compression_encodings.html
A. None: raw (no compression).
B. Bytedict: In byte dictionary encoding, a separate dictionary of unique values is created for each block of column values on disk. (An Amazon Redshift disk block occupies 1 MB.) The dictionary contains up to 256 one-byte values that are stored as indexes to the original data values. If more than 256 values are stored in a single block, the extra values are written into the block in raw, uncompressed form. The process repeats for each disk block. This encoding is very effective when a column contains a limited number of unique values.
C. Delta: Delta encodings are very useful for datetime columns. Delta encoding compresses data by recording the difference between values that follow each other in the column.
D. LZO: LZO encoding provides a very high compression ratio with good performance. LZO encoding works well for CHAR and VARCHAR columns that store very long character strings, especially free-form text such as product descriptions, user comments, or JSON strings.
E. Mostly: Mostly encodings are useful when the data type for a column is larger than most of the stored values require.
F. Runlength: Runlength encoding replaces a value that is repeated consecutively with a token that consists of the value and a count of the number of consecutive occurrences (the length of the run).
G. Text255 and text32k: These encodings are useful for compressing VARCHAR columns in which the same words recur often.
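As a rough illustration of specifying several of these encodings at table definition time (the table and column names below are made up for the example; they are not part of the TPC-H data set):
create table product_reviews (
    review_id     bigint        encode delta,     -- steadily increasing values, so the deltas stay small
    review_date   date          encode delta,     -- datetime-style column
    product_name  char(20)      encode bytedict,  -- limited number of unique values per block
    star_rating   smallint      encode mostly8,   -- declared type is larger than most stored values require
    verified      char(1)       encode runlength, -- long runs of the same repeated value
    review_body   varchar(5000) encode lzo        -- long free-form text
);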
4. Every table you create has three hidden columns appended to it: INSERT_XID, DELETE_XID, and ROW_ID (OID). A table with 3 user-defined columns therefore contains 6 actual columns; the user-defined columns are internally numbered 0, 1, and 2, and the INSERT_XID, DELETE_XID, and ROW_ID columns are numbered 3, 4, and 5, respectively, in this example.
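One way to see these internal column numbers for a loaded table (a sketch that assumes the stv_blocklist and stv_tbl_perm system tables, using the orders table loaded above) is to count disk blocks per column number:
-- Blocks per internal column number for the 'orders' table.
select col, count(*) as blocks
from stv_blocklist b, stv_tbl_perm p
where b.tbl = p.id
and b.slice = p.slice
and p.name = 'orders'
group by col
order by col;
-- The three highest col values are the hidden INSERT_XID, DELETE_XID, and ROW_ID columns.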
5. Analyze command: By default, the COPY command performs an analysis after it loads data into an empty table. To build or update statistics, run the ANALYZE command against the entire current database, a single table, or one or more specific columns in a single table.
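For example, assuming the venue table from the TICKIT sample data set (loaded in the post below):
analyze;                            -- the entire current database
analyze venue;                      -- a single table
analyze venue(venueid, venuename);  -- specific columns in a single table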
6. Vacuum command: In Redshift, every update is effectively a delete and an insert. Since Amazon Redshift does not automatically reclaim and reuse space that is freed when you delete rows from tables or update rows in tables, you should run the VACUUM command to reclaim space following deletes.
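For example, after a large delete:
vacuum;         -- reclaim space and re-sort rows in every table in the current database
vacuum orders;  -- a single table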
Monday, December 9, 2013
Redshift example
This is a great place to get started with Redshift: http://docs.aws.amazon.com/redshift/latest/gsg/getting-started.html. There are a couple of pieces of information in the step-by-step instructions I wanted to elaborate on:
1. Creating a cluster subnet group is done in the Redshift area of the console. This is not evident in the instructions. It of course assumes a VPC with two public subnets and a route to an Internet gateway (IGW) has already been created.
2. The copy command is issued from SQL Workbench: copy venue from 's3://awssampledb/tickit/venue_pipe.txt' CREDENTIALS 'aws_access_key_id=<your access key>;aws_secret_access_key=<your secret key>' delimiter '|';
3. Make sure to set auto commit; otherwise you need to commit each statement or block of statements, and the example does not include commit commands.
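If you would rather leave auto commit off, a minimal alternative is to wrap the load in an explicit transaction:
begin;
copy venue from 's3://awssampledb/tickit/venue_pipe.txt' CREDENTIALS 'aws_access_key_id=<your access key>;aws_secret_access_key=<your secret key>' delimiter '|';
commit;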