ra3.xlplus local disk and stv_partitions
Mmm. Interesting. ra3.xlplus nodes seem to have changed a bit - I think
they used to have two disk partitions of 932gb, one for their own data
and the other for k-safety. I see this still with dc2.large (but with
smaller disks of course). Now in ra3.xlplus however I see a single disk
partition of twice that size. Half of that must be being used for
k-safety, but stv_partitions can’t indicate this any more, since there’s
only one disk, so owner now equals host. On the face of it,
stv_partitions is now showing incorrect data for ra3.xlplus.
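For anyone wanting to look at this on their own cluster, a minimal sketch of the sort of query involved is below. It assumes a superuser connection (stv_partitions is not visible to ordinary users), and the choice of columns is mine; capacity and used are in 1 MB blocks.

```sql
-- One row per disk partition. "owner" is the node whose data the
-- partition holds, "host" is the node the disk is physically attached to.
select owner,
       host,
       diskno,
       capacity,   -- in 1 MB blocks
       used        -- in 1 MB blocks
from   stv_partitions
order by owner, host, diskno;

-- Count of partitions holding another node's data. With one partition
-- per node and owner always equal to host, this comes back as zero.
select count(*) as mirror_partitions
from   stv_partitions
where  owner <> host;
```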
ra3.large
First ra3.large benchmarks;
https://www.redshift-observatory.ch/cross_region_benchmarks/index.html
Failed to start, or is unavailable, in us-east-1.
Bring up times for ra3.large are long.
https://www.redshift-observatory.ch/bring_up_times/index.html
Running a leader-node only query from a view is about 0.5s to 1s
faster than running the same SQL by issuing it directly (this on an
ra3.16xlarge). Postgres (the leader node) stores views in a parsed
form, not their original SQL, so I would guess that creating a view
performs work once which otherwise has to be performed every time the
query is issued.
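A rough way to try this comparison yourself, in psql with timing on, is sketched below. The view name leader_only_example is hypothetical, and the query is just an arbitrary catalog-only query picked for illustration, not the one the timings above come from.

```sql
\timing on

-- Issued directly: the SQL text is parsed each time it is sent.
select n.nspname,
       count(*) as relations
from   pg_class as c
       join pg_namespace as n on n.oid = c.relnamespace
group by n.nspname;

-- The same SQL wrapped in a view (hypothetical name), stored in parsed form.
create view leader_only_example as
select n.nspname,
       count(*) as relations
from   pg_class as c
       join pg_namespace as n on n.oid = c.relnamespace
group by n.nspname;

select * from leader_only_example;
```

Run each form a few times before comparing, since single timings are noisy.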
I am coming to the view that layered views do not work. It becomes too hard to know what’s going on with joins, and in RS you need to know, because you can, and for performance really must, avoid a lot of joins by using window functions. Layered views make that too difficult.
This means then that each view has to be whole and complete in itself.
In practice you can have some layering - but those layers in fact almost cannot contain joins.
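To illustrate the kind of join a window function can replace, here is a small sketch. The table orders and its columns customer_id, order_ts and amount are hypothetical, invented for the example.

```sql
-- Join form: the table is read twice, once to aggregate and once to
-- join the aggregate back onto every row.
select o.customer_id,
       o.order_ts,
       o.amount,
       t.customer_total
from   orders as o
       join ( select customer_id,
                     sum( amount ) as customer_total
              from   orders
              group by customer_id ) as t
         on t.customer_id = o.customer_id;

-- Window form: same result, one pass over the table, no join.
select customer_id,
       order_ts,
       amount,
       sum( amount ) over ( partition by customer_id ) as customer_total
from   orders;
```

If the join form were buried in a lower-level view, someone writing the view on top of it would have no easy way to see that the join is there, or to remove it.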
Little discovery. There are three different formats for the string
representation of an interval, and the user can switch between them
using SET.
dev=# select interval '10 day 5 hour 16 minute 55 second';
interval
------------------
10 days 05:16:55
(1 row)
Time: 125.447 ms
dev=# set intervalstyle to postgres_verbose;
SET
Time: 217.357 ms
dev=# select interval '10 day 5 hour 16 minute 55 second';
interval
-----------------------------------
@ 10 days 5 hours 16 mins 55 secs
(1 row)
Time: 126.878 ms
dev=# set intervalstyle to sql_standard;
SET
Time: 213.241 ms
dev=# select interval '10 day 5 hour 16 minute 55 second';
interval
--------------
10 5:16:55.0
(1 row)
Time: 208.436 ms
dev=# exit
So you can’t rely on the string output being what you expect.
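One way to defend against this, if code has to parse the string, is to set the style explicitly at the start of the session rather than trusting the default. The sketch below assumes the third style, the one producing the first output above, is named postgres, as it is in Postgres itself; I have not confirmed the name on Redshift.

```sql
-- Pin the interval output style for this session.
-- Assumption: "postgres" is the name of the default style.
set intervalstyle to postgres;

-- Confirm what the session is now using.
show intervalstyle;

select interval '10 day 5 hour 16 minute 55 second';
```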
sf.copy_errors
Check this - a view from the upcoming replacement system tables which
shows, in human readable form, the single most recent COPY error for
your user - what you see here is actual output, not formatting from
psql. I’ve done it this way because often columns have wide values,
which means the line often wraps, and then it’s hard to read.
prod=# select * from sf.copy_errors ;
key | value
-----------+--------------------------------------------------------------------------------------------
event_ts | 2024-10-30 17:39:24.228849
column | public.test_table.column_1
reason | Delimiter not found
file_name | s3://wib-aoplop-bucket/aoplop/wib_for_aoplop/20240901-20241001/wib_for_aoplop-00005.csv.gz
line | 2
value | alpha_beta_gamma
(6 rows)
Nice, huh? :-)
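For readers curious how a key/value layout like this can be produced in plain SQL, a rough sketch against the stock stl_load_errors table follows. This is not the definition of sf.copy_errors, only an illustration of the unpivot idea. The ord column is my own addition to keep the rows in a fixed order, and the column value here is just the column name, not the fully qualified name shown above.

```sql
-- Sketch only: NOT the definition of sf.copy_errors.
with latest as
(
  -- the single most recent COPY error for the current user
  select starttime, colname, err_reason, filename, line_number, raw_field_value
  from   stl_load_errors
  where  userid = current_user_id
  order by starttime desc
  limit 1
)
select key, value
from
(
  select 1 as ord, 'event_ts'  as key, starttime::varchar              as value from latest
  union all
  select 2,        'column',           trim( colname )::varchar                 from latest
  union all
  select 3,        'reason',           trim( err_reason )::varchar              from latest
  union all
  select 4,        'file_name',        trim( filename )::varchar                from latest
  union all
  select 5,        'line',             line_number::varchar                     from latest
  union all
  select 6,        'value',            trim( raw_field_value )::varchar         from latest
) as kv
order by ord;
```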