Window function does not work in pd.read_sql; it raises an error
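Question
I am currently doing data analysis on the European Soccer SQLite database in Google Colab (a Jupyter notebook).
The goal of the analysis: for a specific team, e.g. Chelsea, get a win/loss label for each match (done with a CASE statement), then count the matches partitioned by season and by win/loss outcome.
All of this is done inside a pd.read_sql() statement in Google Colab (Jupyter notebook).
The statement ran fine until I introduced the window function, yet the same query runs fine in DB Browser for SQLite (screenshot attached). The main error I get is OperationalError: near "(": syntax error
Here is the code: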
Home_Perf = pd.read_sql(""" --- CTE to get the wins and loss as a home team
WITH Homes AS (
SELECT season, team_long_name AS HomeTeam,
home_team_goal, away_team_goal,
CASE
WHEN home_team_goal > away_team_goal THEN 'win'
WHEN home_team_goal < away_team_goal THEN 'loss'
ELSE 'Tie' END AS Win_Loss
FROM match
---Inner JOIN for getting the team name
INNER JOIN team
ON team_api_id = home_team_api_id
WHERE home_team_api_id = 8455)
SELECT season, HomeTeam,
COUNT(Win_Loss) OVER(PARTITION BY season) AS counts
FROM homes""", conn)
Home_Perf
Here is the error:
ERROR:root:An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line string', (1, 38))
---------------------------------------------------------------------------
OperationalError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/io/sql.py in execute(self, *args, **kwargs)
1585 try:
-> 1586 cur.execute(*args, **kwargs)
1587 return cur
OperationalError: near "(": syntax error
The above exception was the direct cause of the following exception:
DatabaseError Traceback (most recent call last)
3 frames
<ipython-input-17-9b1c924dbbdd> in <module>()
15 SELECT season, HomeTeam,
16 COUNT(Win_Loss) OVER(PARTITION BY season) AS counts
---> 17 FROM homes""", conn)
18 Home_Perf
/usr/local/lib/python3.6/dist-packages/pandas/io/sql.py in read_sql(sql, con, index_col, coerce_float, params, parse_dates, columns, chunksize)
410 coerce_float=coerce_float,
411 parse_dates=parse_dates,
--> 412 chunksize=chunksize,
413 )
414
/usr/local/lib/python3.6/dist-packages/pandas/io/sql.py in read_query(self, sql, index_col, coerce_float, params, parse_dates, chunksize)
1631
1632 args = _convert_params(sql, params)
-> 1633 cursor = self.execute(*args)
1634 columns = [col_desc[0] for col_desc in cursor.description]
1635
/usr/local/lib/python3.6/dist-packages/pandas/io/sql.py in execute(self, *args, **kwargs)
1596
1597 ex = DatabaseError(f"Execution failed on sql '{args[0]}': {exc}")
-> 1598 raise ex from exc
1599
1600 @staticmethod
DatabaseError: Execution failed on sql ' --- CTE to get the wins and loss as a home team
WITH Homes AS (
SELECT season, team_long_name AS HomeTeam,
home_team_goal, away_team_goal,
CASE
WHEN home_team_goal > away_team_goal THEN 'win'
WHEN home_team_goal < away_team_goal THEN 'loss'
ELSE 'Tie' END AS Win_Loss
FROM match
---Inner JOIN for getting the team name
INNER JOIN team
ON team_api_id = home_team_api_id
WHERE home_team_api_id = 8455)
SELECT season, HomeTeam,
COUNT(Win_Loss) OVER(PARTITION BY season) AS counts
FROM homes': near "(": syntax error
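Recommended answer
tl;dr: Google Colab uses SQLite 3.22, but SQLite only supports window functions from version 3.25 onward.
I believe the problem is that Google Colab uses an outdated version of SQLite that does not support window functions. Google really ought to address this! I am writing this as of March 6, 2022.
To get to your point, more specifically:
On September 15, 2018, SQLite released version 3.25, and as you can see in the release log here, the first item listed for that release is:
Add support for window functions
This is a problem I have run into before, and the trick is to update your SQLite3 library. However, you are not on a local machine; you are on Google Colab.
So the next thing to do is check your SQLite version. You can do that by running this line of SQL: SELECT sqlite_version();
I got this tip from here.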
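The version can also be read from Python without running a query, because the standard `sqlite3` module reports the version of the SQLite library it is linked against. Below is a minimal sketch, assuming the notebook connection is created with the standard `sqlite3` module; the helper name `supports_window_functions` is made up for illustration.

```python
import sqlite3

# Window functions were added in SQLite 3.25.0.
WINDOW_FUNCTIONS_MIN_VERSION = (3, 25, 0)


def supports_window_functions(version_info=sqlite3.sqlite_version_info):
    """Return True if the given SQLite version tuple is at least 3.25.0."""
    return tuple(version_info) >= WINDOW_FUNCTIONS_MIN_VERSION


# Version of the SQLite library linked into this Python interpreter.
print("SQLite library version:", sqlite3.sqlite_version)
print("Window functions supported:", supports_window_functions())
```

Note that `sqlite3.sqlite_version` is the version of the SQLite library itself, not the version of the Python module.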
Since you are querying from within Pandas, you would run:
pd.read_sql_query("""
SELECT sqlite_version();
""", conn)
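So I ran this today (3/6/22) and got the following in my Google Colab:
This version is too old. I do not know how to update the SQLite library in Google Colab, so this is as far as I got; I do not think we have that feature for now. This post may help with updating SQLite in the notebook.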
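If upgrading is not possible, one workaround (not part of the original answer) is to rewrite the query so that it needs no window function. The sketch below emulates COUNT(Win_Loss) OVER (PARTITION BY season) by aggregating once per season and joining the counts back onto each row. The table contents are invented sample rows standing in for the output of the `Homes` CTE, not the real dataset.

```python
import sqlite3

# Hypothetical stand-in for the rows produced by the Homes CTE (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE homes (season TEXT, HomeTeam TEXT, Win_Loss TEXT)")
conn.executemany(
    "INSERT INTO homes VALUES (?, ?, ?)",
    [
        ("2008/2009", "Chelsea", "win"),
        ("2008/2009", "Chelsea", "loss"),
        ("2008/2009", "Chelsea", "win"),
        ("2009/2010", "Chelsea", "Tie"),
        ("2009/2010", "Chelsea", "win"),
    ],
)

# Aggregate per season in a subquery, then join the per-season count back
# onto every row. This avoids the OVER clause entirely.
query = """
SELECT h.season, h.HomeTeam, c.counts
FROM homes AS h
INNER JOIN (
    SELECT season, COUNT(Win_Loss) AS counts
    FROM homes
    GROUP BY season
) AS c
ON c.season = h.season
ORDER BY h.season
"""

# The same query string could be passed to pd.read_sql(query, conn) instead.
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In the original query, `homes` would stay a CTE rather than a table; referencing it twice should work on older SQLite versions, since CTEs were introduced well before window functions. Another option is to fetch the rows without the window function and compute the count in pandas, for example with `df.groupby("season")["Win_Loss"].transform("count")`.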