A tip about Terraform
Terraform is an interesting (in my opinion) tool for implementing Infrastructure-as-Code. When I first used it to write a production script yesterday, I got an error report: Error: error validating provider...
The uneasy way to implement SSDLite by myself
SSDLite is a variant of the Single Shot MultiBox Detector (SSD). It uses MobileNetV2 instead of VGG as the backbone, which makes detection much faster. I was trying to implement SSDLite from the code base...
Accelerate the speed of data loading in PyTorch
I got a desktop computer last week to train deep learning models. The GPU is a GTX 1050 Ti with 4GB of memory, which is enough for basic training on object detection. But the CPU is too old. Therefore when I...
Google Cloud Summit 2019
Yesterday I joined the Google Cloud Summit 2019 in Sydney. The venue was quite large, and there were a lot of booths from different Google Cloud partners. The keynote was quite abstract and a...
Some tips about using AWS Glue
Configuring the data format: To use AWS Glue, I wrote a ‘catalog table’ into my Terraform script: resource "aws_glue_catalog_table" "my_table" { ... input_format =...
Processing date and time in AWS Redshift
Since AWS Redshift doesn’t have a function like FROM_UNIX(), it’s rather awkward to get a formatted time from a UNIX timestamp (called ‘epoch’ in Redshift): SELECT timestamp 'epoch' + my_timestamp_column *...
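The arithmetic behind that kind of expression (a fixed epoch plus a number of seconds) can be sketched outside Redshift. The following is only a minimal Python illustration, assuming the column holds whole seconds since 1970-01-01 UTC. The helper name `epoch_to_formatted` and the output format are hypothetical and are not taken from the article.

```python
from datetime import datetime, timedelta, timezone

# Reference point: the UNIX epoch (what Redshift spells as timestamp 'epoch').
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_to_formatted(seconds: int, fmt: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Add `seconds` to the epoch and format the result (hypothetical helper).

    Assumption: `seconds` is a count of seconds, not milliseconds.
    """
    return (EPOCH + timedelta(seconds=seconds)).strftime(fmt)


if __name__ == "__main__":
    print(epoch_to_formatted(0))
    print(epoch_to_formatted(86400))
```

If the stored values were milliseconds instead, they would need to be divided by 1000 before the addition.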
Some problems about using AWS DMS
AWS DMS is a new type of service for migrating data between different types of databases and data warehouses. I ran into some problems when trying to use it in a production environment. Problem 1. When using a...
A problem of using Pyspark SQL
Here is the code: from pyspark.sql import SQLContext from pyspark.context import SparkContext from pyspark.sql.types import * from typing import List sc = SparkContext() sqlContext =...
An example of using Spark Structured Streaming
This snippet monitors two directories and joins the data from them whenever a new CSV file appears in either directory. from pyspark.sql import SQLContext from pyspark.context import SparkContext from...
The MySQL master-slave drift problem in AWS
About one month ago, we ran into a problem with a MySQL master-slave architecture on AWS EC2. The MySQL master ran very fast, but the slave lagged behind it by about two or three hours. We...
Using Single Shot Detection to detect birds (Episode four)
In the previous article, I reached an mAP of 0.770 on the VOC2007 test set. Four months have passed. After trying a lot of interesting ideas from different papers, such as FPN, CELU, and RFBNet, I finally realised that...
A convenient environment to write LaTex
More than a year ago, I wrote a paper about how to accelerate Deep Learning training for sparse features and dense features (images). To write this paper, I installed a bunch of tools and plugins...
A problem about using DataFrame in Apache Spark
Here is the code for loading a CSV file (table employee) into an Apache Spark DataFrame: val schema = StructType( Seq( StructField("id", LongType), StructField("birthday", DateType),...
The generating speed for random number in Python3
I just want to generate a random number in a range (whether float or integer) using Python. Since I only need one random number at a time in my code, the speed for calling the...
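One way to measure this kind of per-call cost is with the standard `timeit` module. The sketch below is a hedged illustration only: the excerpt does not say which calls were compared, so the three candidate statements, the range `0..10`, and the helper names `time_call` and `benchmark` are all assumptions.

```python
import random
import timeit

# Assumed candidates: the excerpt does not name the functions it measured.
CANDIDATES = [
    "random.random()",
    "random.uniform(0, 10)",
    "random.randint(0, 10)",
]


def time_call(stmt: str, number: int = 100_000) -> float:
    """Return the total seconds taken to execute `stmt` `number` times."""
    return timeit.timeit(stmt, globals={"random": random}, number=number)


def benchmark(number: int = 100_000) -> dict:
    """Time every candidate statement and map statement -> total seconds."""
    return {stmt: time_call(stmt, number) for stmt in CANDIDATES}


if __name__ == "__main__":
    for stmt, seconds in benchmark().items():
        print(f"{stmt:<24} {seconds:.4f} s")
```

Absolute numbers depend on the machine and the Python version, so only the relative ordering on a given setup is meaningful.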
Books I read in year 2019
At the beginning of 2019, I finished the book “The Great Siege: Malta 1565”. The story about a few loyal knights protecting Europe from the Ottoman Empire is so extraordinary that it encouraged me to...
Tips about pytest
1. Error: “fixture ‘mocker’ not found”. After running pytest, it reported: E fixture 'mocker' not found > available fixtures: cache, capfd, capsys, doctest_namespace, mock, mocker, monkeypatch,...
How to ignore illegal sample of dataset in PyTorch?
I have implemented a dataset class for my image samples, but it can’t handle the case where a corrupted image is read: import torch.utils.data as data class MyDataset(data.Dataset): ... def...
Directly deploy containers on GCP VM instance
We can deploy containers directly onto a VM instance of Google Compute Engine, instead of launching a heavyweight Kubernetes cluster. The command looks like: gcloud compute instances create-with-container...
Problem about installing Kubeflow
I tried to install Kubeflow by following this guide. But when I ran kfctl apply -V -f https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_k8s_istio.0.7.1.yaml it reported INFO[0000]...
Some problems when using GCP
After I launched a Compute Engine instance with a container, it reported an error: gcr.io/xx/xx-xx/feature:yy Feb 03 00:12:28 xx-d19b201 konlet-startup[4664]: {"errorDetail":{"message":"failed to register layer: Error...