Interpreting user request
I need to understand the user's request in Chinese about football league key player data statistics. It's a bit vague, and they could need help with statistical analysis, script creation, or a report. I should ask clarifying questions: which league, data sources, metrics, timeframe, output format, coding language, and if a dataset exists. Also, I'm considering our environment settings; it seems I need to follow protocols regarding running commands judiciously. I plan to gather data, define metrics, and compute relevant statistics.
Proposing coding options
I need to provide a Python code template using pandas for the user's request. I could suggest using CSVs from football-data.co.uk or scraping from FBref, but I must check if network approvals are needed. I'll propose working offline and ask the user for data or permission to scrape. I can offer three options: 1) they share a CSV, and I'll write the script; 2) they approve fetching data from certain sites; or 3) they provide the metrics and timeframe. I could include a sample script to calculate key metrics and rankings.
Using StatsBomb data for analysis
I can use the StatsBomb open data from GitHub for both men's and women's football, even without scraping, although it's a bit more complex. Since the user might be Chinese-speaking, I should respond in Chinese. I'll outline a structured plan with bullet points that clarify the scope, data source, metrics, and output. I'll also provide a code skeleton, ask key questions, and define next steps. Following the guidelines, I'll keep the formatting light. I can also offer to adapt mapping if the user sends a sample row.
Computing player metrics
I'll compute metrics like npxG, where we set pens_xG to 0 unless a column exists. Then, I want to calculate values like npxG90, npxG+xA per 90, shot volume, key passes, SCA, GCA, and defensive actions, among others, while checking for missing columns. I plan to group percentiles by position. I can also provide a small notebook or CLI tool if the user wants that. ASCII is fine for coding, and I'll keep it concise while ensuring clarity in the response.
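What kind of statistics do you want? First, help me confirm these key details:
- League and season range (for example, the 2024-25 Premier League, or the last 3 seasons)
- Data source (do you already have a CSV/Excel file, or should I pull open data from FBref/Understat/StatsBomb?)
- Metric preferences and output format (leaderboard/table/radar chart; export to CSV/Excel/image; whether to split by position)
- Minimum playing-time threshold (e.g. ≥900 minutes), and whether to apply per-90 normalization and percentiles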
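To make the threshold and per-90 points concrete, here is a small plain-Python sketch. The player labels and numbers are made up purely for illustration, and `per90` is a hypothetical helper rather than a library function. It shows why a minutes cutoff matters: one goal in 45 minutes works out to 2 goals per 90, a rate no regular starter sustains.

```python
def per90(total, minutes):
    """Scale a season total to a per-90-minute rate; None if no minutes played."""
    if minutes <= 0:
        return None
    return total / minutes * 90


# Illustrative (made-up) season lines: (label, minutes, goals)
players = [
    ("Starter", 2700, 15),  # roughly 30 full matches
    ("Cameo", 45, 1),       # one goal in half a match
]

MIN_MINUTES = 900  # example threshold, adjust to your league and sample size

for label, minutes, goals in players:
    rate = per90(goals, minutes)
    eligible = minutes >= MIN_MINUTES
    print(f"{label}: {rate:.2f} goals/90, eligible={eligible}")
```

Filtering on minutes before computing percentiles keeps such small-sample rates from crowding the top of a leaderboard.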
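Suggested dimensions (stratified by position, with per-90 values and percentiles):
- Attacking output: non-penalty goals, assists, npxG, xA, shots, key passes, SCA/GCA
- Progression threat: progressive passes/carries, passes into the final third, passes into the penalty area
- Finishing efficiency: goals minus xG, goals per shot, shots-on-target rate
- Build-up and link play: pass attempts/completion rate, completion rate for passes into the final third/penalty area
- Defending and duels: tackles, interceptions, pressures, aerial duel win rate, ground duels
- Availability: minutes played, starts, position label (FW/MF/DF, or finer-grained)
- Composite: build a "key player score" from weights or PCA, and output a leaderboard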
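For the PCA option in the last bullet, one possible sketch uses NumPy's SVD on standardized metrics and takes the first principal component as the score. The function name `pca_key_score` and the sample values are assumptions for illustration only; real data would need the same minutes filter and position grouping as the other metrics.

```python
import numpy as np


def pca_key_score(X):
    """Return first-principal-component scores for a (players x metrics) array."""
    X = np.asarray(X, dtype=float)
    # Standardize each metric so that scale differences do not dominate
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns contribute nothing
    Z = (X - X.mean(axis=0)) / std
    # SVD of the standardized matrix: rows of Vt are the principal axes
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    pc1 = Vt[0]
    # The sign of a principal axis is arbitrary; orient it so that
    # higher metric values give a higher score overall
    if pc1.sum() < 0:
        pc1 = -pc1
    return Z @ pc1


# Made-up per-90 values for four players: columns are npxG90, xA90, KP90
X = [
    [0.60, 0.25, 2.1],
    [0.35, 0.30, 2.4],
    [0.10, 0.05, 0.8],
    [0.45, 0.15, 1.5],
]
scores = pca_key_score(X)
print(np.argsort(-scores))  # player indices from highest to lowest score
```

Compared with hand-set weights, PCA lets the data pick the weighting, but the resulting loadings are harder to explain, so it may be worth inspecting `pc1` before trusting the ranking.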
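Below is a draft Python/pandas script you can run directly: it reads a season-level player summary CSV, computes per-90 values, percentiles, and a composite score, and exports leaderboards. You only need to map the column names to your actual data.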
# file: key_players.py
import numpy as np
import pandas as pd

# Configuration
INPUT = "players.csv"            # your season-level player data
OUT_LEADERS = "leaders.csv"      # league leaderboard
OUT_BY_TEAM = "by_team_top.csv"  # top players within each team
MIN_MINUTES = 900                # minimum-minutes threshold
# Coarse position groups, keyed by the first letter of the position label
# ("G" rather than "GK", because map_pos only looks at the first letter)
POS_MAP = {"F": "FW", "M": "MF", "D": "DF", "G": "GK"}

# Expected columns: canonical name -> column name in your CSV.
# Columns may be missing; the script falls back to 0 for them.
EXPECTED = {
    "player": "player",
    "team": "team",
    "pos": "pos",
    "minutes": "minutes",
    "goals": "goals",
    "pens_made": "pens_made",
    "assists": "assists",
    "shots": "shots",
    "key_passes": "key_passes",
    "xG": "xG",
    "xA": "xA",
    "sca": "sca",
    "gca": "gca",
    "progressive_passes": "prog_passes",
    "progressive_carries": "prog_carries",
    "passes_into_final_third": "p_final3",
    "passes_into_penalty_area": "p_box",
    "tackles": "tackles",
    "interceptions": "interceptions",
    "pressures": "pressures",
    "aerials_won": "aerials_won",
    "aerials_lost": "aerials_lost",
}


def load():
    df = pd.read_csv(INPUT)
    # Normalize column names to the canonical keys of EXPECTED.
    # Edit the values of EXPECTED to match the real column names in your CSV.
    rename = {v: k for k, v in EXPECTED.items() if v in df.columns}
    df = df.rename(columns=rename)
    # Fill missing columns with 0
    for k in EXPECTED:
        if k not in df.columns:
            df[k] = 0

    # Coarse position grouping
    def map_pos(p):
        if isinstance(p, str) and p:
            c = p[0].upper()
            return POS_MAP.get(c, c)
        return "UNK"

    df["pos_grp"] = df["pos"].apply(map_pos)
    return df


def per90(df):
    m = df["minutes"].replace(0, np.nan)

    def p90(x):
        return x / m * 90

    # Subtract penalty xG if a pens_xG column exists; otherwise npxG equals xG
    pens_xg = df["pens_xG"] if "pens_xG" in df.columns else 0.0
    df["npxG"] = df["xG"] - pens_xg
    df["npxG90"] = p90(df["npxG"])
    df["xA90"] = p90(df["xA"])
    df["G90"] = p90(df["goals"])
    df["A90"] = p90(df["assists"])
    df["Shots90"] = p90(df["shots"])
    df["KP90"] = p90(df["key_passes"])
    df["SCA90"] = p90(df["sca"])
    df["GCA90"] = p90(df["gca"])
    df["ProgAct90"] = p90(df["progressive_passes"] + df["progressive_carries"])
    df["Final3P90"] = p90(df["passes_into_final_third"])
    df["BoxP90"] = p90(df["passes_into_penalty_area"])
    df["DefAct90"] = p90(df["tackles"] + df["interceptions"])
    df["Press90"] = p90(df["pressures"])
    # Aerial duel win rate (NaN when the player contested no aerial duels)
    air_total = df["aerials_won"] + df["aerials_lost"]
    df["AerialWin%"] = df["aerials_won"] / air_total.replace(0, np.nan)
    # Combined attacking output
    df["npxG+xA90"] = df["npxG90"] + df["xA90"]
    # Finishing efficiency (simple version)
    df["G-xG"] = df["goals"] - df["xG"]
    return df


def percentiles(df, cols, by="pos_grp"):
    # Percentile rank of each metric within its position group; NaN stays NaN
    for c in cols:
        df[c + "_pct"] = (
            df[c].astype(float).groupby(df[by]).rank(pct=True, method="average")
        )
    return df


def composite_score(df):
    # Different positions can use different weights; these are defaults
    weights = {
        "FW": {"npxG90_pct": 0.35, "xA90_pct": 0.15, "Shots90_pct": 0.10,
               "KP90_pct": 0.10, "SCA90_pct": 0.10, "ProgAct90_pct": 0.05,
               "DefAct90_pct": 0.05, "Press90_pct": 0.05, "AerialWin%_pct": 0.05},
        "MF": {"xA90_pct": 0.20, "KP90_pct": 0.15, "SCA90_pct": 0.15,
               "ProgAct90_pct": 0.15, "Final3P90_pct": 0.10, "BoxP90_pct": 0.05,
               "DefAct90_pct": 0.10, "Press90_pct": 0.05, "npxG90_pct": 0.05},
        "DF": {"DefAct90_pct": 0.30, "Press90_pct": 0.10, "AerialWin%_pct": 0.15,
               "ProgAct90_pct": 0.10, "Final3P90_pct": 0.10, "BoxP90_pct": 0.05,
               "xA90_pct": 0.05, "KP90_pct": 0.05, "npxG90_pct": 0.10},
        "GK": {},  # this script does not score goalkeepers
    }
    cols_for_pct = ["npxG90", "xA90", "Shots90", "KP90", "SCA90", "GCA90",
                    "ProgAct90", "Final3P90", "BoxP90", "DefAct90", "Press90",
                    "AerialWin%"]
    df = percentiles(df, cols_for_pct + ["G-xG"], by="pos_grp")

    scores = []
    for _, row in df.iterrows():
        # Unknown position groups fall back to the midfielder weights
        w = weights.get(row["pos_grp"], weights["MF"])
        s = 0.0
        for k, alpha in w.items():
            v = row.get(k, np.nan)
            if np.isfinite(v):
                s += alpha * v
        scores.append(s)
    df["KeyScore"] = scores
    return df


def main():
    df = load()
    # Filter by minutes played
    df = df[df["minutes"] >= MIN_MINUTES].copy()
    df = per90(df)
    df = composite_score(df)

    # Leaderboard: top 50 per position group
    leaders = (df.sort_values(["pos_grp", "KeyScore"], ascending=[True, False])
                 .groupby("pos_grp")
                 .head(50)
                 .reset_index(drop=True))

    # Export
    keep_cols = ["player", "team", "pos", "pos_grp", "minutes",
                 "npxG90", "xA90", "npxG+xA90", "Shots90", "KP90", "SCA90", "GCA90",
                 "ProgAct90", "Final3P90", "BoxP90", "DefAct90", "Press90",
                 "AerialWin%", "G-xG", "KeyScore"]
    leaders[keep_cols].to_csv(OUT_LEADERS, index=False)

    # Top 5 within each team
    by_team = (df.sort_values("KeyScore", ascending=False)
                 .groupby("team")
                 .head(5)
                 .reset_index(drop=True))
    by_team[["team", "player", "pos", "KeyScore", "npxG+xA90",
             "ProgAct90", "DefAct90"]].to_csv(OUT_BY_TEAM, index=False)
    print(f"Saved {OUT_LEADERS} and {OUT_BY_TEAM}")


if __name__ == "__main__":
    main()
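One design point in the scoring loop is worth noting: a non-finite percentile is simply skipped, so a player with a missing metric ends up with a lower total than an otherwise identical player. A possible alternative, which is not what the script does, is to rescale by the weights that were actually used. The helper `weighted_score` below is hypothetical and only sketches that idea.

```python
import math


def weighted_score(pcts, weights, renormalize=True):
    """Weighted sum of percentile values, skipping missing (None/NaN) entries.

    With renormalize=True the sum is divided by the total weight actually
    used, so a missing metric does not drag the score down.
    """
    total, used = 0.0, 0.0
    for name, w in weights.items():
        v = pcts.get(name)
        if v is None or math.isnan(v):
            continue
        total += w * v
        used += w
    if not renormalize:
        return total
    return total / used if used > 0 else float("nan")


# Made-up example: the xA90 percentile is missing for this player
weights = {"npxG90_pct": 0.5, "xA90_pct": 0.5}
pcts = {"npxG90_pct": 0.8, "xA90_pct": None}

print(weighted_score(pcts, weights, renormalize=False))  # skip-only behaviour
print(weighted_score(pcts, weights, renormalize=True))   # rescaled variant
```

Which behaviour is preferable depends on whether a missing column means "no data" or "zero activity" in your source.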
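Usage
- Prepare a players.csv with at least these columns: player, team, pos, minutes, goals, assists, shots, key_passes, xG, xA; the script also runs if the remaining columns are missing.
- If your column names differ, change the right-hand values of EXPECTED at the top of the script to your actual column names.
- Run: python key_players.py. It generates leaders.csv (league leaderboard) and by_team_top.csv (top 5 within each team).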
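If you want to try the script before wiring in real data, a tiny placeholder file in the minimum layout can be produced with the standard library. Every row below is invented (placeholder names and numbers, not real players), and `sample_csv_text` is a hypothetical helper.

```python
import csv
import io

# Minimum columns the script expects
FIELDS = ["player", "team", "pos", "minutes", "goals", "assists",
          "shots", "key_passes", "xG", "xA"]

# Made-up placeholder rows, one per position group
ROWS = [
    ["Player A", "Team X", "FW", 2400, 14, 5, 80, 30, 12.3, 4.1],
    ["Player B", "Team X", "MF", 2100, 4, 9, 35, 60, 3.8, 7.5],
    ["Player C", "Team Y", "DF", 2700, 1, 2, 12, 15, 1.1, 1.6],
    ["Player D", "Team Y", "GK", 2700, 0, 0, 0, 1, 0.0, 0.1],
]


def sample_csv_text():
    """Return the placeholder dataset as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(FIELDS)
    writer.writerows(ROWS)
    return buf.getvalue()


if __name__ == "__main__":
    # newline="" keeps the csv module's own line endings intact
    with open("players.csv", "w", newline="", encoding="utf-8") as f:
        f.write(sample_csv_text())
    print("wrote players.csv")
```

With only one player per position group the percentiles are trivial, so this is useful for checking that the pipeline runs, not for judging the scores.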
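If you like, I can:
- Adapt the script to your existing CSV and produce charts;
- Fetch the data for a given league and season and clean it into the format above;
- Adjust the weights/metrics to build a custom "key player score" and radar charts.
Send me the first few rows of a sample dataset, or tell me the target league and season, and I will continue refining this into a reusable pipeline.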