To construct a DataFrame more effectively
The old Python code looks like:

import pandas as pd

temp = pd.DataFrame()
for record in table:
    df = pd.DataFrame(record)
    temp = pd.concat([temp, df])

# The final result
result = temp

This snippet...
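The excerpt is cut off before the improved version. As a rough sketch of the faster pattern the title points at (assuming `table` is an iterable of per-row records, which is not spelled out in the excerpt), build all the pieces first and concatenate once:

import pandas as pd

# Hypothetical records, standing in for the `table` used in the post.
table = [{"a": [1], "b": [2]}, {"a": [3], "b": [4]}]

# Build the per-record frames first, then concatenate once;
# calling pd.concat inside the loop re-copies the accumulated data every iteration.
result = pd.concat([pd.DataFrame(record) for record in table], ignore_index=True)
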
Some tips about Argo Workflows (on Kubernetes)
While using Argo to execute workflows last week, I met some problems and also found the solutions. 1. Can’t parse “outputs”. By submitting this YAML file:

apiVersion: argoproj.io/v1alpha1
kind: Workflow...
Grab a hands-on realtime-object-detection tool
Trying to get a fast object-detection tool from GitHub (by fast I mean detection in less than 1 second on a mainstream CPU), I experimented with some repositories written in PyTorch (because I am familiar...
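The excerpt doesn't name the repositories it tried. As an illustration of the kind of CPU-latency check involved (the model choice, input size, and timing code below are my assumptions, not the post's), one might time a stock torchvision detector like this:

import time
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

image = torch.rand(3, 480, 640)  # a dummy frame; a real test would use camera frames

with torch.no_grad():
    start = time.perf_counter()
    predictions = model([image])
    print(f"inference took {time.perf_counter() - start:.2f}s on CPU")
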
Be careful of the ternary operator in Python
from pathlib import Path

date = "yes"
my_path = Path("hello") / date if date else "no" / "last"
print(my_path)

The result will be: hello/yes. Where did the "last" go? It goes with the "no". The Python interpreter...
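A short sketch of the precedence issue (my own illustration, not taken from the post): the conditional expression binds more loosely than /, so explicit parentheses show the two possible groupings.

from pathlib import Path

date = "yes"

# How Python actually parses the original line:
implicit = (Path("hello") / date) if date else ("no" / "last")
print(implicit)  # hello/yes -- the else branch never runs here

# If the intent was hello/<date or "no">/last, parenthesize the conditional:
explicit = Path("hello") / (date if date else "no") / "last"
print(explicit)  # hello/yes/last
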
Image pull policy in Kubernetes
Recently, we have been using Kubernetes for our project. Yesterday, a problem haunted me severely: even though I had pushed the Docker image to GCR (Google Container Registry), the pod in Kubernetes would still use...
Use both ‘withParam’ and ‘when’ in Argo Workflows (on Kubernetes)
In Argo, we can use ‘withParam’ to create loop logic:

- - name: generate
    template: gen-number-list
# Iterate over the list of numbers generated by the generate step above
- - name: sleep
    template:...
Problems about using treelite in Kubernetes
treelite is an easy-to-use tool to accelerate the prediction speed of XGBoost/LightGBM models. Three days ago, I tested it in my 4-CPU-core virtual machine and found out that it could reduce the running...
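For context, a minimal sketch of the usual treelite flow (based on the treelite 1.x API, which has changed across releases; the model file, toolchain, and parameters here are placeholders, not the post's setup):

import numpy as np
import treelite
import treelite_runtime

# Compile a saved LightGBM model file into a shared library (treelite 1.x API).
model = treelite.Model.load("lightgbm_model.txt", model_format="lightgbm")
model.export_lib(toolchain="gcc", libpath="./predictor.so",
                 params={"parallel_comp": 4}, verbose=True)

# Load the compiled library and run a prediction on some dummy features.
predictor = treelite_runtime.Predictor("./predictor.so")
X = np.random.rand(1000, 32).astype(np.float32)
predictor.predict(treelite_runtime.DMatrix(X))
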
Recently learned tips about NumPy and Pandas
Precision. After running this snippet:

import numpy as np

a = np.array([0.112233445566778899], dtype=np.float32)
b = np.array([0.112233445566778899], dtype=np.float64)
print(a, b)

It prints...
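The excerpt stops before the output. The underlying point (a property of the dtypes, not something quoted from the post) is that float32 keeps only about 7 significant decimal digits while float64 keeps about 16, which NumPy can report directly:

import numpy as np

# Decimal precision and machine epsilon for the two dtypes.
print(np.finfo(np.float32).precision, np.finfo(np.float32).eps)  # 6   ~1.19e-07
print(np.finfo(np.float64).precision, np.finfo(np.float64).eps)  # 15  ~2.22e-16
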
Use `psql` to download data as CSV file
Although SQL Workbench is a handy tool for querying AWS Redshift, we still need a CLI tool for automation. To install psql on macOS, we need to run:

brew install postgresql

Then we could download data...
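The excerpt cuts off before the download step. As an illustration only (the connection string, table, and file names below are made up), psql's \copy meta-command can write query results to a local CSV, and it is easy to drive from Python for automation:

import subprocess

# Placeholder Redshift connection info; the password usually comes from
# the PGPASSWORD environment variable or ~/.pgpass.
conninfo = "host=my-cluster.example.redshift.amazonaws.com port=5439 dbname=dev user=awsuser"

# \copy runs the query server-side but writes the CSV on the local machine.
command = r"\copy (SELECT * FROM my_table LIMIT 1000) TO 'my_table.csv' WITH CSV HEADER"

subprocess.run(["psql", conninfo, "-c", command], check=True)
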
The nn.Sigmoid() of PyTorch on Android device
Using PyTorch, I have trained an EfficientNet model to classify more than ten thousand different categories of birds. To run this model on a mobile device, I built a program by learning the...
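The excerpt stops before the details. Below is a rough sketch, not the post's code, of how a classifier ending in nn.Sigmoid() is typically exported for PyTorch Mobile on Android; the torchvision backbone, input size, and file name are all assumptions:

import torch
import torch.nn as nn
from torchvision.models import efficientnet_b0
from torch.utils.mobile_optimizer import optimize_for_mobile

# Stand-in backbone; the actual model is a trained EfficientNet with >10,000 bird classes.
backbone = efficientnet_b0(num_classes=10000)
model = nn.Sequential(backbone, nn.Sigmoid())  # nn.Sigmoid() turns raw logits into scores
model.eval()

example = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example)
optimized = optimize_for_mobile(traced)
optimized._save_for_lite_interpreter("birds.ptl")  # this file is loaded by the Android app
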
Some tips about BigQuery on GCP
Migrate SQL script from AWS Redshift to BigQuery: CONVERT_TIMEZONE('AEDT', getdate())::DATE in Redshift should be changed to current_date("Australia/Sydney") in BigQuery. Since BigQuery doesn’t force type...
A problem about running Argo
After I launched an Argo workflow, it just hung at the ContainerCreating stage. After waiting for more than 10 minutes, it hadn’t changed at all. Then I found this article. After using kubectl describe...
Failed to establish pod watch in Argo
After creating a brand new Kubernetes cluster in GKE, I launched an Argo workflow but saw these errors. Argo will create two containers for a step: a ‘main’ container and a ‘wait’ container. But why the...
Experiments on Bayesian Optimization
Bayesian Optimization is a popular search algorithm for hyper-parameters in the machine learning area. There are two popular Python libraries for this algorithm: Hyperopt and Optuna. So I have...
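For readers who haven't used either library, here is a minimal Optuna example of the kind of search being compared; the objective below is a toy function I made up, not the post's experiment:

import optuna

def objective(trial):
    # Toy 1-D objective; a real experiment would train and score a model here.
    x = trial.suggest_float("x", -10.0, 10.0)
    return (x - 2.0) ** 2

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)
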
First trial for PyCaret
The first time I noticed PyCaret was on the recommendation page of Google Chrome. In recent days I got time to test it. The original test program failed because there is a column which has a number...
The first trial for PyCaret
The code for using PyCaret is quite simple:

import pandas as pd
from pycaret.classification import setup, compare_models  # or pycaret.regression, depending on the task

df = pd.read_csv(TRAIN_CSV_FILE)
setup(data=df, target="TARGET", session_id=1023)
compare_models(verbose=False)

But it reported an error in the first run:...
Submit Argo workflow to different clusters
A couple of days ago I was looking for a tool to manage different Kubernetes clusters from my laptop. But after a while, I realized that kubectl actually supports multiple clusters by itself (link)....
Using GPU for LightGBM
One of my team members had done some tests on using a GPU for LightGBM training. The result is quite good: the GPU could make training about 2 times faster. But this also raises my...
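For reference, this is roughly what enabling the GPU looks like (a sketch with random toy data, not the team's actual benchmark; it requires a LightGBM build compiled with GPU support):

import lightgbm as lgb
import numpy as np

# Toy data just to make the sketch runnable.
X = np.random.rand(10000, 50)
y = np.random.randint(0, 2, size=10000)
train_set = lgb.Dataset(X, label=y)

params = {
    "objective": "binary",
    "device_type": "gpu",   # falls back with an error if LightGBM lacks GPU support
    "gpu_platform_id": 0,   # optionally pin the OpenCL platform/device explicitly
    "gpu_device_id": 0,
}
booster = lgb.train(params, train_set, num_boost_round=100)
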
Using loop in Jsonnet
Jsonnet is a templating language and tool for generating JSON/YAML files. Since we now have a language instead of static configuration, we can generate a bunch of configurations with simple code. For...
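To keep the illustration in Python, here is a small sketch (my own example, not the post's) that evaluates a Jsonnet list comprehension through the official jsonnet Python bindings; the field names and environments are made up:

import json
import _jsonnet  # pip install jsonnet

# A loop in Jsonnet: one config object per environment, built with a list comprehension.
snippet = """
[
  { name: "service-" + env, replicas: if env == "prod" then 3 else 1 }
  for env in ["dev", "staging", "prod"]
]
"""
configs = json.loads(_jsonnet.evaluate_snippet("loop.jsonnet", snippet))
print(configs)
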