Research Project

Bots in Chilean Political YouTube

An analysis of automated accounts in the comment sections of Chilean political news videos, inspired by the FactCheck LT study (March 2025).

View source & peer-review on GitHub

17,636

Comments Analyzed

30

Videos Analyzed

10,315

Unique Accounts

202

Bot Accounts Detected

Methodology

How we collected data and detected bots

Data Collection

  • Channel scanning: scrapetube fetches all videos from Chilean news channels
  • Political filtering: 128 curated Spanish keywords (institutions, figures, parties, topics) filter political videos
  • Comment scraping: youtube-comment-downloader collects all comments without the YouTube API
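The political filtering step can be sketched as an accent-insensitive keyword match on video titles. This is a minimal illustration, not the project's code: the keyword set below is a small hypothetical sample (the study's curated list has 128 terms), and `normalize` and `is_political` are illustrative names.

```python
import re
import unicodedata

# Hypothetical sample, stored without accents; the study's list has 128 keywords.
POLITICAL_KEYWORDS = {"presidente", "senado", "constitucion", "elecciones", "diputados"}

def normalize(text: str) -> str:
    """Lowercase and strip accents so 'Constitución' matches 'constitucion'."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def is_political(title: str, keywords: set[str] = POLITICAL_KEYWORDS) -> bool:
    """Return True if any keyword appears as a whole word in the title."""
    words = set(re.findall(r"[a-z0-9]+", normalize(title)))
    return not keywords.isdisjoint(words)
```

Matching whole words rather than substrings avoids accidental hits inside longer words, at the cost of missing inflected forms unless they are listed explicitly.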

Justification: FactCheck LT

Our approach is inspired by the FactCheck LT study (March 2025), which analyzed 94,532 comments across 111 channels and found:

  • Bots made up less than 1% of accounts
  • Those accounts generated 11.6% of all comments
  • Bots were active in 38.8% of the analyzed videos

We apply the same account-vs-volume distinction: an account is classified as a bot if any one of its comments scores above 0.5, and all comments from that account are then counted as bot-generated.
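The account-vs-volume rule above can be sketched as a small aggregation over per-comment scores. The function name, the input shape (one `(author, score)` pair per comment) and the sample data are assumptions made for illustration.

```python
from collections import defaultdict

BOT_THRESHOLD = 0.5  # from the text: a score above 0.5 flags a comment

def account_vs_volume(comments):
    """comments: iterable of (author, score) pairs, one pair per comment."""
    by_author = defaultdict(list)
    for author, score in comments:
        by_author[author].append(score)
    if not by_author:
        return {"bot_accounts": 0, "bot_account_share": 0.0, "bot_comment_share": 0.0}

    # An account is a bot if any single comment exceeds the threshold.
    bots = {a for a, scores in by_author.items() if max(scores) > BOT_THRESHOLD}
    total_comments = sum(len(scores) for scores in by_author.values())
    # Every comment from a bot account counts as bot-generated.
    bot_comments = sum(len(by_author[a]) for a in bots)
    return {
        "bot_accounts": len(bots),
        "bot_account_share": len(bots) / len(by_author),
        "bot_comment_share": bot_comments / total_comments,
    }

# Hypothetical data: one of three accounts has a single flagged comment.
sample = [("@a", 0.7), ("@a", 0.1), ("@a", 0.2), ("@b", 0.3), ("@c", 0.0)]
stats = account_vs_volume(sample)
```

In this sample one account out of three is a bot, yet it is credited with three of the five comments, which is the gap between account share and comment share that the study measures.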

Bot Detection Heuristics

Each comment is scored 0.0 to 1.0. Scores above 0.5 flag the comment as bot-generated. Signals stack additively, capped at 1.0.
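A minimal sketch of the additive scoring, assuming each signal contributes a value between 0 and its documented maximum. The signal keys and function names are illustrative; only the maximum weights and the 0.5 threshold come from this page.

```python
# Maximum contribution per signal, as listed in this section.
MAX_WEIGHTS = {
    "username": 0.35,
    "astroturfing": 0.40,
    "attack": 0.40,
    "propaganda": 0.30,
    "cross_video": 0.50,
}

def combine_signals(signals: dict[str, float]) -> float:
    """Sum the per-signal contributions and cap the total at 1.0."""
    total = 0.0
    for name, value in signals.items():
        # Clamp each signal to [0, its documented maximum].
        total += min(max(value, 0.0), MAX_WEIGHTS[name])
    return min(1.0, total)

def is_flagged(score: float, threshold: float = 0.5) -> bool:
    """A comment is flagged only when its score is strictly above the threshold."""
    return score > threshold
```

If the threshold is applied strictly, as "above 0.5" suggests, no single signal can flag a comment on its own: even the cross-video signal tops out at exactly 0.50, so at least two signals have to fire.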

Username Patterns

Auto-generated names ("@user-abc123"), excessive digits, random strings with no real-name pattern.

up to +0.35
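As an illustration of the username signal, the sketch below checks for auto-generated handles and a high digit ratio. The regular expression, the 0.4 digit cutoff and the split of the 0.35 budget into 0.25 and 0.10 are assumptions; the page only states the overall maximum. The "random string" check is not covered here.

```python
import re

# Matches handles shaped like "@user-abc123" (assumed pattern).
AUTO_HANDLE = re.compile(r"^@?user-[a-z0-9]+$", re.IGNORECASE)

def username_score(handle: str) -> float:
    """Return a username-based contribution between 0.0 and 0.35."""
    name = handle.lstrip("@")
    score = 0.0
    if AUTO_HANDLE.match(handle):
        score += 0.25  # assumed weight for auto-generated names
    digits = sum(ch.isdigit() for ch in name)
    if name and digits / len(name) > 0.4:  # assumed cutoff for "excessive digits"
        score += 0.10
    return min(score, 0.35)
```

A handle such as "@MariaPerez" scores 0.0 under these assumptions, while a hypothetical "@x90210417" picks up only the digit penalty.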

Positive Astroturfing

Generic praise without substance ("Excelente!", "Tiene toda la razon"), promotional spam URLs.

up to +0.40

Negative / Attack Bots

Single-word political insults as entire comments, ALL CAPS rage, repetitive spam, emoji floods.

up to +0.40
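Three of the attack signals (ALL CAPS, emoji floods, repetition) can be approximated with simple text statistics. This is a sketch under stated assumptions: the emoji ranges are a rough approximation, and every cutoff (80% uppercase, 5 emoji, fewer than half of the words unique) is hypothetical. Detecting single-word insults would need a curated word list and is left out.

```python
import re

# Rough emoji coverage (assumption): pictographs, emoticons, misc symbols, dingbats.
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

def caps_ratio(text: str) -> float:
    """Share of alphabetic characters that are uppercase."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch.isupper() for ch in letters) / len(letters)

def attack_signals(text: str) -> dict[str, bool]:
    """Return which attack-style signals fire for one comment."""
    words = [w.lower() for w in text.split()]
    return {
        "all_caps": len(text) >= 10 and caps_ratio(text) > 0.8,
        "emoji_flood": len(EMOJI.findall(text)) >= 5,
        "repetitive": len(words) >= 4 and len(set(words)) / len(words) < 0.5,
    }
```

The minimum-length guards keep short replies such as "OK" or "SI" from being treated as ALL CAPS rage.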

Propaganda

Copy-paste slogans posted by different users, unusually formal tone with no colloquialisms.

up to +0.30
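Copy-paste slogans can be surfaced by grouping comments on normalized text and counting distinct authors. The sketch below is one possible approach, not the project's implementation; `shared_slogans`, the minimum of three authors and four words, and the sample comments are all hypothetical.

```python
from collections import defaultdict

def shared_slogans(comments, min_authors=3, min_words=4):
    """comments: iterable of (author, text). Returns {normalized_text: authors}."""
    authors_by_text = defaultdict(set)
    for author, text in comments:
        # Normalize case and whitespace so trivial variations collapse together.
        key = " ".join(text.lower().split())
        if len(key.split()) >= min_words:
            authors_by_text[key].add(author)
    return {t: a for t, a in authors_by_text.items() if len(a) >= min_authors}

# Hypothetical comments: three accounts post the same slogan.
sample_comments = [
    ("@u1", "Chile merece un cambio real"),
    ("@u2", "chile merece  un cambio REAL"),
    ("@u3", "Chile merece un cambio real"),
    ("@u4", "No estoy de acuerdo con esto"),
]
slogans = shared_slogans(sample_comments)
```

The word minimum keeps short, naturally common replies from being mistaken for coordinated slogans.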

Cross-Video Behavior

Same user posting identical or near-identical comments across multiple videos (Jaccard similarity > 0.6).

up to +0.50
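The cross-video check can be sketched with Jaccard similarity over word sets. Whether the project tokenizes by words or by character n-grams is not stated here, so word-level tokens are an assumption, as are the function names and the sample comments.

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the two comments' lowercase word sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0  # treat empty comments as dissimilar
    return len(ta & tb) / len(ta | tb)

def cross_video_duplicates(comments, threshold=0.6):
    """comments: list of (video_id, text) for ONE user.

    Returns the (video_id, video_id) pairs whose comments are near-identical.
    """
    pairs = []
    for (v1, t1), (v2, t2) in combinations(comments, 2):
        if v1 != v2 and jaccard(t1, t2) > threshold:
            pairs.append((v1, v2))
    return pairs

# Hypothetical comments by a single user on two videos.
user_comments = [
    ("v1", "el gobierno miente siempre"),
    ("v2", "el gobierno miente siempre hoy"),
    ("v2", "nada que ver con esto"),
]
dupes = cross_video_duplicates(user_comments)
```

The first two comments share four of five distinct words (similarity 0.8), so they exceed the 0.6 threshold; comments on the same video are never compared.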

Most Used Words

Top 200 words across all comments (Spanish stopwords filtered)

Bot Detection Results

FactCheck-style account vs. comment volume analysis

1.96%

of accounts are bots

8.12%

of comments are by bots

4.4x

more active than humans (7.1 vs 1.6 comments per account on average)

Accounts vs Comment Volume

Bot Categories

Bot Percentage by Video (top 20 most affected)

Political Leaning

Keyword-based classification of comments into left, right, or neutral

Overall Distribution

1.49%

Left-leaning (263 comments)

4.86%

Right-leaning (857 comments)

93.65%

Neutral / unclassified (16,516 comments)
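A keyword-based leaning classifier of this kind can be sketched as follows. The keyword sets are hypothetical samples, and the tie-breaking rule (equal or zero hits count as neutral) is an assumption, not a documented detail of the study.

```python
import re

# Hypothetical samples: terms typically used AGAINST the other side.
LEFT_KEYWORDS = {"fachos", "pinochetistas"}
RIGHT_KEYWORDS = {"comunistas", "zurdos"}

def classify_leaning(text: str) -> str:
    """Return 'left', 'right', or 'neutral' based on keyword hits."""
    words = re.findall(r"\w+", text.lower())
    left = sum(w in LEFT_KEYWORDS for w in words)
    right = sum(w in RIGHT_KEYWORDS for w in words)
    if left > right:
        return "left"
    if right > left:
        return "right"
    return "neutral"  # no hits, or an even split
```

Because most comments contain no listed keyword, a classifier like this labels the large majority as neutral, which fits the 93.65% unclassified share reported above.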

Political Leaning by Video (top 20 most politically active)

Top Suspected Bot Accounts

Top 20 accounts ranked by maximum bot score

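| # | Account | Max Score | Total Comments | Flagged | Videos | Avg Score |
|---|---|---|---|---|---|---|
| 1 | @RodrigoLarenasSolari | 0.95 | 6 | 4 | 6 | 0.608 |
| 2 | @pablobaltodano1982 | 0.85 | 6 | 6 | 6 | 0.642 |
| 3 | @rosavalenzuela9600 | 0.80 | 8 | 1 | 6 | 0.375 |
| 4 | @jimenaserranoosses3228 | 0.75 | 2 | 2 | 2 | 0.750 |
| 5 | @cokeriesko | 0.70 | 29 | 5 | 15 | 0.393 |
| 6 | @nicolasdoxrud6880 | 0.70 | 25 | 14 | 20 | 0.480 |
| 7 | @MariaFuentes-lf6wg | 0.70 | 14 | 4 | 6 | 0.414 |
| 8 | @rosapincheira8752 | 0.70 | 10 | 1 | 7 | 0.360 |
| 9 | @omarravanalorellana4151 | 0.70 | 12 | 2 | 7 | 0.383 |
| 10 | @marisoladasme5253 | 0.70 | 9 | 2 | 6 | 0.378 |
| 11 | @camilocarreno7685 | 0.70 | 3 | 1 | 2 | 0.400 |
| 12 | @victorjorqueramolina6157 | 0.70 | 7 | 6 | 7 | 0.657 |
| 13 | @CarlosJotazeta | 0.70 | 22 | 2 | 11 | 0.364 |
| 14 | @a.fuentes1237 | 0.70 | 9 | 2 | 7 | 0.400 |
| 15 | @miguelgomez-cm6ii | 0.70 | 3 | 1 | 3 | 0.367 |
| 16 | @mariazamorano9182 | 0.70 | 5 | 1 | 5 | 0.380 |
| 17 | @JonathanRaffo-v1y | 0.70 | 14 | 10 | 13 | 0.486 |
| 18 | @79107 | 0.65 | 6 | 1 | 3 | 0.417 |
| 19 | @daniellllll45419 | 0.65 | 7 | 2 | 7 | 0.450 |
| 20 | @cb4017 | 0.65 | 7 | 2 | 6 | 0.379 |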

Conclusions

Summary of findings and limitations

Key Findings

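  • 1.96% of accounts generate 8.12% of comments. This matches the pattern reported by the FactCheck LT study (<1% of accounts, 11.6% of comments): a small share of accounts produces a disproportionate share of comments. The result suggests that bot activity in Chilean political YouTube resembles international findings, although the sample here is smaller (17,636 vs 94,532 comments).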
  • Bots are 4.4x more active. Each bot account averages 7.1 comments vs 1.6 for human accounts, often posting the same or very similar text across multiple videos.
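  • Cross-video duplication is the strongest signal. Users who post identical or near-identical comments across different videos are overwhelmingly likely to be automated, and this signal carries the largest weight in our scoring (up to +0.50). The idea is similar to the Levenshtein-based duplicate detection used by the YT-Spammer-Purge project, although this study measures similarity with Jaccard overlap (threshold 0.6).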
  • Both political sides are targeted. Bot categories include astroturfing (positive support), attack bots (negative insults), and propaganda (copy-paste slogans), suggesting orchestrated campaigns rather than organic behavior.
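The activity figures in the findings above can be re-derived from the headline counts. The number of bot comments is not reported directly on this page, so the sketch estimates it from the rounded 8.12% share; all variable names are illustrative.

```python
comments_total = 17_636
accounts_total = 10_315
bot_accounts = 202
bot_comment_share = 0.0812  # reported figure, rounded to two decimals

# Estimated, since the exact bot comment count is not stated.
bot_comments = round(comments_total * bot_comment_share)
human_accounts = accounts_total - bot_accounts
human_comments = comments_total - bot_comments

bot_avg = bot_comments / bot_accounts        # comments per bot account
human_avg = human_comments / human_accounts  # comments per human account
activity_ratio = bot_avg / human_avg
bot_account_share = bot_accounts / accounts_total

print(f"bot avg {bot_avg:.1f}, human avg {human_avg:.1f}, ratio {activity_ratio:.1f}x")
print(f"bot accounts: {bot_account_share:.2%}")
```

Under these inputs the averages round to 7.1 and 1.6 comments per account, and their ratio rounds to 4.4x.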

Limitations

  • Heuristic-based, not ML-based. Our bot detection uses curated keyword lists and behavioral signals, not trained classifiers. False positives are possible for passionate users who comment frequently.
  • Political leaning is approximate. Left/right classification uses keyword co-occurrence, not sentiment analysis. A comment saying "los comunistas destruyen Chile" is classified as right-leaning based on the word "comunistas". That label is correct in this case, but the rule ignores tone, so a comment that uses the same word neutrally or approvingly would receive the same label.
  • Dataset scope. The analysis covers a sample of Chilean political channels, not the entirety of Chilean YouTube. Results may not generalize to all political content.
  • Temporal snapshot. Comments were scraped at a single point in time. Bot activity may fluctuate around elections or political events.

Comparison with FactCheck LT

| Metric | Our Study | FactCheck LT |
|---|---|---|
| Bot accounts (% of users) | 1.96% | <1% |
| Bot comment volume (% of comments) | 8.12% | 11.6% |
| Comments analyzed | 17,636 | 94,532 |

Full source code, data & methodology on GitHub

Research by Maximiliano Militzer · Built with Next.js, youtube-comment-downloader, scrapetube · Data analyzed with Python