Showing posts with the label spanner.

2022/12/14

What makes a good bulk insert

To get optimal write throughput for bulk loads, partition your data by primary key with this pattern:
- Each partition contains a range of consecutive rows.
- Each commit contains data for only a single partition.

A good rule of thumb for your number of partitions is 10 times the number of nodes in your Cloud Spanner instance. So if you have N nodes, with a total of 10*N partitions, you can assign rows to partitions by:

- Sorting your data by primary key.
- Dividing it into 10*N separate sections.
- Creating a set of worker tasks that upload the data. Each worker will write to a single partition. Within the partition, it is recommended that your worker write the rows sequentially. However, writing data randomly within a partition should also provide reasonably high throughput.

  As more of your data is uploaded, Cloud Spanner automatically splits and rebalances your data to balance load on the nodes in your instance. During this process, you may experience temporary drops in throughput.

  Following this pattern, you should see a maximum overall bulk write throughput of 10-20 MiB per second per node.
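
The partitioning steps above can be sketched in Go roughly as follows. This is only a minimal sketch: `splitIntoPartitions` is a hypothetical helper, the keys are assumed to be strings whose sort order matches the primary-key order, and the actual Spanner writes (one worker per partition, one partition per commit) are left out.

```go
package main

import "sort"

// splitIntoPartitions sorts the keys and divides them into up to 10*nodes
// contiguous sections, following the "10 partitions per node" rule of thumb.
// Hypothetical helper for illustration; it is not part of any Spanner client.
func splitIntoPartitions(keys []string, nodes int) [][]string {
	// Copy first so the caller's slice is not reordered.
	sorted := append([]string(nil), keys...)
	sort.Strings(sorted)

	n := 10 * nodes
	if n > len(sorted) {
		n = len(sorted) // never create empty partitions
	}
	if n <= 0 {
		return nil
	}

	parts := make([][]string, 0, n)
	for i := 0; i < n; i++ {
		start := i * len(sorted) / n
		end := (i + 1) * len(sorted) / n
		// Each partition is a range of consecutive keys.
		parts = append(parts, sorted[start:end])
	}
	return parts
}
```

Each returned slice would then be handed to one worker, which writes its rows in order and commits data for that partition only.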

2022/12/07

spanner nullable data type golang

 spanner nullable data type

```
// NullableValue is the interface implemented by all null value wrapper types.
type NullableValue interface {
	// IsNull returns true if the underlying database value is null.
	IsNull() bool
}
```
- How to use
  - Difference between null, "", and "someValue"
    - null
      - spanner.NullString{Valid: false}
    - ""
      - spanner.NullString{Valid: true, StringVal: ""}
  - Update (value [*NullableValue] -> model [NullableValue])
    - Columns that are updated
      - the value is not nil
    - Columns that are not updated
      - the value is nil
    - Every field of the value object (VO) is a pointer
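
The update rule above could look roughly like this in Go. This is a sketch under assumptions: `NullString` here is a local stand-in that mirrors the `StringVal`/`Valid` fields of `spanner.NullString` so the example has no external dependency, and `UserModel`, `UserPatch`, and `applyPatch` are hypothetical names that do not come from any library.

```go
package main

// NullString is a local stand-in with the same shape as spanner.NullString
// (assumption: used here only to keep the sketch dependency-free).
type NullString struct {
	StringVal string
	Valid     bool
}

// IsNull reports whether the underlying database value is null.
func (n NullString) IsNull() bool { return !n.Valid }

// UserModel is a hypothetical row model: fields are nullable values.
type UserModel struct {
	Name     NullString
	Nickname NullString
}

// UserPatch is a hypothetical value object: every field is a pointer.
// nil means "do not touch this column"; non-nil means "update it",
// including an update to NULL or to the empty string.
type UserPatch struct {
	Name     *NullString
	Nickname *NullString
}

// applyPatch copies only the non-nil fields into the model and returns
// the names of the columns that should be written.
func applyPatch(m *UserModel, p UserPatch) []string {
	var cols []string
	if p.Name != nil {
		m.Name = *p.Name
		cols = append(cols, "Name")
	}
	if p.Nickname != nil {
		m.Nickname = *p.Nickname
		cols = append(cols, "Nickname")
	}
	return cols
}
```

With this shape, `UserPatch{Nickname: &NullString{}}` sets Nickname to NULL and leaves Name alone, while `UserPatch{Name: &NullString{Valid: true}}` sets Name to the empty string. The three cases (untouched, NULL, "") stay distinguishable.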

By the way, GORM, another ORM tool, does not handle this problem well.
var user User
db.First(&user)

db.Model(&user).Updates(User{Age: 18, Name: nil})
// Executed SQL: UPDATE `users` SET `age` = '18'  WHERE `users`.`id` = '1'

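The GORM documentation says the following, so it seems that zero-value fields are ignored when the call is made with a struct: "When query with struct, GORM will only query with those fields has non-zero value, that means if your field's value is 0, '', false or other zero values, it won't be used to build query conditions". The same rule applies to Updates: when updating with a struct, GORM only updates non-zero fields.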
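
To see why a plain struct cannot express "set this column to its zero value", here is a small sketch that picks fields the way the quoted rule describes. It is not GORM's actual implementation: `Account` and `nonZeroFields` are hypothetical names, and the zero check simply uses `reflect.Value.IsZero` from the standard library.

```go
package main

import "reflect"

// Account is a hypothetical model used only for this illustration.
type Account struct {
	Name   string
	Age    int
	Active bool
}

// nonZeroFields returns the names of the struct fields whose value is not
// the zero value, mimicking the "only non-zero fields" rule quoted above.
// v must be a struct value (not a pointer).
func nonZeroFields(v interface{}) []string {
	rv := reflect.ValueOf(v)
	rt := rv.Type()
	var names []string
	for i := 0; i < rv.NumField(); i++ {
		if !rv.Field(i).IsZero() {
			names = append(names, rt.Field(i).Name)
		}
	}
	return names
}
```

Calling `nonZeroFields(Account{Age: 18, Name: ""})` yields only `Age`. An explicit empty Name is indistinguishable from a Name that was never set, which is the gap the pointer-based value object closes.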

2022/11/22

cloud spanners shard

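- record -> split -> node
- The primary key determines where each record is stored.
- There is no auto-increment mechanism. Storing rows in sequential key order concentrates writes on a specific node (a hotspot), which is undesirable.
- Ways to generate IDs other than auto-increment: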

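UUID v1 to UUID v4

- UUID v1 is timestamp-based and uses the MAC address, so the IDs may come out sequential.
- UUID v4 is built from random numbers, so it distributes well, but the chance of a collision is not zero.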
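
As a rough illustration of the UUID v4 option, the sketch below builds one from the standard library alone. `newUUIDv4` is a hypothetical helper written for this note; in practice a maintained UUID package would be the usual choice.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
)

// newUUIDv4 returns a random (version 4) UUID in the canonical
// 8-4-4-4-12 hex form. Hypothetical helper for illustration.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version field to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits

	var out [36]byte
	hex.Encode(out[0:8], b[0:4])
	out[8] = '-'
	hex.Encode(out[9:13], b[4:6])
	out[13] = '-'
	hex.Encode(out[14:18], b[6:8])
	out[18] = '-'
	hex.Encode(out[19:23], b[8:10])
	out[23] = '-'
	hex.Encode(out[24:36], b[10:16])
	return string(out[:]), nil
}
```

Because the leading characters are random, rows keyed by such an ID spread across splits instead of piling up at the end of the key range.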

snowflake

ULID

When IDs are generated from a timestamp, they inevitably distribute poorly.

So, handle this with a composite key:
shardId = hash(key) % N
PRIMARY KEY (shardId, <column with poorly distributed values>)

Use a built-in hash function:
- farm_fingerprint
- etc.
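
The shardId = hash(key) % N scheme above can be sketched on the application side as follows. Note the assumptions: FNV-1a from Go's standard library is used here only as a stand-in hash and does not produce the same values as farm_fingerprint, and `shardID` and `shardedKey` are hypothetical names.

```go
package main

import "hash/fnv"

// shardID maps a key to one of numShards buckets using FNV-1a.
// FNV-1a is a stand-in for illustration; it is NOT farm_fingerprint.
// numShards must be greater than zero.
func shardID(key string, numShards uint64) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return h.Sum64() % numShards
}

// shardedKey is a hypothetical composite primary key: the shard ID comes
// first so that rows with adjacent timestamps land in different key ranges.
type shardedKey struct {
	ShardID   uint64
	Timestamp int64
}

// newShardedKey builds the composite key for a row identified by key.
func newShardedKey(key string, ts int64, numShards uint64) shardedKey {
	return shardedKey{ShardID: shardID(key, numShards), Timestamp: ts}
}
```

The same key always maps to the same shard, so a reader can recompute the shard ID for a point lookup, while a scan over a timestamp range has to query all N shards.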