• Reading data in Colab: GitHub, Google Drive, and local files

GitHub

Read directly from the raw URL

import pandas as pd

url = 'https://raw.githubusercontent.com/06Cata/Kaggle_Titanic/main/raw_data/train.csv'

df = pd.read_csv(url)

df.head()
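Note that pd.read_csv needs the raw.githubusercontent.com address. The github.com page address containing /blob/ returns an HTML page, not the CSV. Below is a minimal sketch of converting one into the other, assuming the usual /<user>/<repo>/blob/<branch>/<path> layout. `to_raw_url` is a hypothetical helper name, and `page_url` is simply the page form of the raw URL used above.

```python
from urllib.parse import urlparse


def to_raw_url(page_url: str) -> str:
    """Convert a github.com '/blob/' page URL into its raw.githubusercontent.com form."""
    parts = urlparse(page_url)
    if parts.netloc != "github.com":
        return page_url  # already raw (or not GitHub): leave untouched
    # Assumed path layout: /<user>/<repo>/blob/<branch>/<path...>
    segments = parts.path.strip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub file page URL: {page_url}")
    user, repo, _, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([user, repo, *rest])


page_url = "https://github.com/06Cata/Kaggle_Titanic/blob/main/raw_data/train.csv"
raw_url = to_raw_url(page_url)
print(raw_url)
```

The resulting `raw_url` can then be passed to pd.read_csv as before.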


Save to a temporary file, then read it

import pandas as pd
import tempfile
import requests

url = 'https://raw.githubusercontent.com/06Cata/Kaggle_Titanic/main/raw_data/train.csv'
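response = requests.get(url)
response.raise_for_status()  # stop early if the download failed

# Save the downloaded bytes to a temporary file.
# The with block flushes and closes the file before pandas reads it.
with tempfile.NamedTemporaryFile(delete=False, suffix='.csv') as temp_file:
    temp_file.write(response.content)
    temp_file_path = temp_file.name

df2 = pd.read_csv(temp_file_path)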
df2
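The temporary-file step can be tried without any network access. The sketch below uses only the standard library and a made-up two-row sample in place of response.content. `save_to_temp_csv` is a hypothetical helper name. Because delete=False leaves the file on disk, the sketch also removes it at the end.

```python
import csv
import os
import tempfile


def save_to_temp_csv(content: bytes) -> str:
    """Write raw bytes to a temporary .csv file, close it, and return its path."""
    # delete=False keeps the file on disk after the with block closes it
    with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
        tmp.write(content)
    return tmp.name


# Stand-in for response.content (made-up sample rows, not the real train.csv)
sample = b"PassengerId,Survived\n1,0\n2,1\n"
path = save_to_temp_csv(sample)

# Read the file back to confirm the bytes were flushed to disk
with open(path, newline="") as f:
    rows = list(csv.reader(f))
print(rows)

os.remove(path)  # with delete=False, cleanup is manual
```

In Colab, pd.read_csv(path) would take the place of the csv.reader step.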


Google Drive

!pip install -U -q PyDrive

import pandas as pd
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate: a Google sign-in page opens; pick your account and click Continue
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Google Drive sharing link to the file
link = 'https://drive.google.com/file/d/1tPJeYAHyFGhHmdDTSzvuiBSXbVH3uWgM/view?usp=sharing'

# Extract the file ID from the link
file_id = link.split('/d/')[1].split('/')[0]

# download
downloaded = drive.CreateFile({'id': file_id})
downloaded.GetContentFile('train.csv')

df_drive = pd.read_csv('train.csv')
df_drive
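The split('/d/') approach works for links of the form /file/d/<id>/view. Drive links can also carry the ID as an ?id=<id> query parameter, in which case that split raises an IndexError. Here is a slightly more tolerant sketch covering both forms. `extract_file_id` is a hypothetical helper name, and the two patterns handled are assumptions about the common link shapes, not an exhaustive list.

```python
import re
from urllib.parse import parse_qs, urlparse


def extract_file_id(link: str) -> str:
    """Pull the file ID out of a Google Drive sharing link."""
    # Form 1: .../file/d/<id>/view?usp=sharing
    match = re.search(r"/d/([^/?#]+)", link)
    if match:
        return match.group(1)
    # Form 2: ...?id=<id>
    query = parse_qs(urlparse(link).query)
    if "id" in query:
        return query["id"][0]
    raise ValueError(f"no file ID found in link: {link}")


link = 'https://drive.google.com/file/d/1tPJeYAHyFGhHmdDTSzvuiBSXbVH3uWgM/view?usp=sharing'
file_id = extract_file_id(link)
print(file_id)
```

The returned `file_id` is what gets passed to drive.CreateFile({'id': file_id}).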


Local files

Upload the file from your machine.
A [Choose Files] button appears first; pick the file from your local disk.

from google.colab import files
import pandas as pd
import io

uploaded = files.upload()

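# uploaded is a dict of {file name: file bytes};
# file_name must match the name of the file you actually uploaded
file_name = 'train.csv'
df3 = pd.read_csv(io.BytesIO(uploaded[file_name]))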

df3.head()
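If the uploaded file is not named exactly train.csv, the hard-coded lookup raises a KeyError. One way around this is to take whatever name came back from the upload. The sketch below uses a made-up dict in place of files.upload(), since google.colab only exists inside Colab. `first_uploaded` is a hypothetical helper name.

```python
import io


def first_uploaded(uploaded: dict) -> tuple:
    """Return (file name, in-memory buffer) for the first uploaded file."""
    if not uploaded:
        raise ValueError("no file was uploaded")
    name = next(iter(uploaded))  # dicts keep insertion order
    return name, io.BytesIO(uploaded[name])


# Stand-in for `uploaded = files.upload()`; name and contents are made up
uploaded = {"my_data.csv": b"a,b\n1,2\n"}

file_name, buffer = first_uploaded(uploaded)
print(file_name)
```

In Colab, `buffer` can be handed straight to pd.read_csv.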


Catalina

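Hi, I’m Catalina!
I used to work in business development for Spanish-speaking markets and officially moved into the data field in 2023.
I’m currently catching up on computer organization, calculus, linear algebra, and probability theory, and taking notes along the way as reminders to myself 🤲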
