Argos: AI-Powered Counterfeit Detection for NAP Solutions
A new AI-powered crawling platform replacing NAP's existing Artemis scraper with intelligent pre-filtering, image-based detection, and confidence scoring - reducing manual review workload by an estimated 70–80%.
Team
What is Argos?
Argos is a standalone, AI-powered counterfeit detection platform. It crawls e-commerce platforms, uses a multi-model AI engine (visual + text) to score listings for potential infringement, and presents results to your QA team for review. Fully independent infrastructure - no dependencies on legacy Artemis.
Detection Pipeline
Multi-Model AI Engine
Confidence Scoring
| Score | Action |
|---|---|
| 70–100 | High Priority - QA review required |
| 40–69 | Review - Optional QA review |
| 0–39 | Auto-filtered - Excluded |
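As a minimal sketch, the routing implied by this table looks like the following (band boundaries are taken from the table; the function and queue names are illustrative, not the Argos API):

```python
def route_listing(score: float) -> str:
    """Map an AI confidence score (0-100) to a review queue,
    mirroring the threshold table above."""
    if score >= 70:
        return "high_priority"   # QA review required
    if score >= 40:
        return "review"          # optional QA review
    return "auto_filtered"       # excluded from the queue
```

Per the Adjustable Thresholds section, the real band boundaries are configurable per category rather than hardcoded.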
Tech Stack
Scope
In Scope (This Engagement)
- ✓ Amazon.com product listing crawling & detection
- ✓ Shopify / branded store crawling & detection
- ✓ AI-powered image + text scoring engine
- ✓ QA review dashboard & admin panel
- ✓ REST API, CSV export (Artemis-compatible), webhooks
- ✓ Cross-platform matching (Amazon ↔ Shopify)
- ✓ Side-by-side validation against legacy system
- ✓ Full source code handover, docs & training
- ✓ 30-day post-handover support
Out of Scope (Evaluated, Not Included)
- ✗ Walmart, Temu, Alibaba, AliExpress crawling (future phases)
- ✗ Regional Amazon domains (.de, .co.uk, etc.) - only .com
- ✗ Smart Search / historical learning from enforcement decisions
- ✗ Dashboard & Statistics module (advanced analytics)
- ✗ Crawling Priorities & automated scheduling (manual trigger only)
- ✗ Automated enforcement actions (detection & review only)
- ✗ Legacy Artemis modifications or decommissioning
- ✗ Ongoing hosting & AI costs (passed to NAP at cost)
- ✗ Extended support beyond 30-day window (retainer available)
Users & Roles
- Create & configure campaigns
- Set keywords, images, filters, whitelists
- Adjust detection thresholds per category
- Manage users and assign roles
- View analytics & performance dashboards
- Set crawling frequency & priorities
- Review AI-flagged listings in dashboard
- Accept or reject with reasoning
- Filter by date, campaign, country, score
- Export approved results to CSV
- Decisions train the AI over time
- Token-based API authentication
- Query detection results programmatically
- Utilize all available filters
- Integrate with existing NAP tools
- Webhook notifications for new results
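Webhook consumers will typically want to verify that a notification really came from Argos. A common pattern is an HMAC signature over the raw request body; this is a sketch of one possible scheme - the header name, secret handling, and payload shape are illustrative assumptions, not the final contract:

```python
import hashlib
import hmac
import json

def verify_webhook(secret: bytes, raw_body: bytes, signature: str) -> bool:
    """Check an HMAC-SHA256 hex signature over the raw request body.
    (The header carrying `signature`, e.g. X-Argos-Signature, is hypothetical.)"""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

# Hypothetical "new results" notification payload
body = json.dumps({"event": "results.ready", "campaign": "miffy", "count": 42}).encode()
good_sig = hmac.new(b"shared-secret", body, hashlib.sha256).hexdigest()
```

A tampered body or wrong secret fails the constant-time comparison, so consumers can safely drop unverified notifications.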
User Stories
Scope & Phases
Deliverables
- Automated Amazon.com product listing crawler
- Rate-limit management & anti-bot handling
- AI-powered image comparison (multi-model: GPT-4o Vision, Claude, Gemini)
- NLP text analysis (Fuzzball fuzzy matching + CLIP pre-screening)
- Confidence scoring with AI reasoning
- Campaign configuration (keywords, images, filters, whitelist)
- Seller location filtering
- QA review dashboard with accept/reject workflow
- REST API for integration + CSV export
- Alert & notification system
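The text pre-screening step named above uses Fuzzball fuzzy matching in the actual engine; as a self-contained illustration of the idea, here is a stdlib stand-in (difflib substitutes for Fuzzball, and the 60-point threshold is illustrative): only listings whose titles resemble a campaign keyword are passed on to the expensive multi-model AI scoring.

```python
from difflib import SequenceMatcher

def fuzzy_score(a: str, b: str) -> float:
    """Similarity ratio in [0, 100], roughly analogous to a
    Fuzzball ratio(); difflib stands in here for illustration."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()

def pre_screen(title: str, campaign_keywords: list[str], threshold: float = 60.0) -> bool:
    """Cheap text pre-filter: only titles resembling a campaign
    keyword proceed to multi-model AI scoring."""
    return any(fuzzy_score(title, kw) >= threshold for kw in campaign_keywords)
```

Filtering out obviously unrelated listings before any model call is what keeps per-listing AI inference costs bounded at crawl volume.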
Weekly Breakdown
Deliverables
- Shopify store discovery and crawling engine
- Cross-platform matching (Amazon ↔ Shopify)
- Shopify-specific heuristics (store age, reviews, pricing)
- Unified dashboard across both platforms
- API User management (tokens, access control)
- Seller location: best-effort (Shopify limitation)
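One cheap signal for the Amazon ↔ Shopify seller linking deliverable is normalized-name matching. The sketch below is illustrative only - real matching would combine several heuristics (images, pricing, contact details), and the suffix list here is an assumption:

```python
import re
from collections import defaultdict

def normalize(seller_name: str) -> str:
    """Lowercase, drop common business suffixes, strip punctuation,
    so 'MiffyWorld LLC' and 'miffy-world' collide on one key."""
    s = re.sub(r"\b(llc|ltd|inc|co|store|shop)\b", "", seller_name.lower())
    return re.sub(r"[^a-z0-9]", "", s)

def link_sellers(amazon: list[str], shopify: list[str]) -> dict[str, list[str]]:
    """Group seller names from both platforms under one normalized key;
    keep only keys seen more than once (candidate multi-platform sellers)."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in amazon + shopify:
        groups[normalize(name)].append(name)
    return {k: v for k, v in groups.items() if len(v) > 1}
```

Candidates surfaced this way would still go through the QA review queue rather than being treated as confirmed links.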
Weekly Breakdown
Deliverables
- Side-by-side testing (min 1,000 listings/day for 10 days)
- Shadow mode → Supervised mode validation
- Complete source code handover (Git transfer)
- Architecture, API (Swagger), deployment docs
- Operations runbook & data dictionary
- Training: Ops (2x2hr), Technical (2x2hr), Platform (1x3hr)
- 30-day post-handover support
Validation Process
Project Timeline
Success Metrics & KPIs
Phase 1 Success Criteria (Amazon)
Phase 2 Success Criteria (Shopify)
Phase 3 Success Criteria (Migration)
Adjustable Thresholds
All AI confidence thresholds - including image similarity, text matching sensitivity, and overall scoring - are configurable per category. No code changes required; admins can tune thresholds directly from the dashboard.
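A per-category configuration could be modeled as something like the following - category names, field names, and the fallback behavior are illustrative; the actual schema is decided during implementation:

```python
# Illustrative per-category threshold config (names are placeholders).
THRESHOLDS = {
    "default":    {"high_priority": 70, "review": 40},
    "plush_toys": {"high_priority": 65, "review": 35},
}

def thresholds_for(category: str) -> dict:
    """Fall back to the defaults when a category has no overrides."""
    return THRESHOLDS.get(category, THRESHOLDS["default"])
```

Because the dashboard edits this configuration rather than code, tuning a category's sensitivity never requires a deployment.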
Risks & Mitigation
What We Need from NAP Solutions
Access & Credentials
- Access to existing Artemis scraper source code
- Sample datasets (known counterfeit + known legitimate listings)
- Campaign guidelines / matrix (Miffy as pilot)
- API keys or account access for testing platforms
Infrastructure
- GCP billing account for ongoing AI & hosting costs
- Decision on proxy service tier (ScraperAPI)
- Preferred communication channel (Slack, Teams, or email)
Ongoing Collaboration
- Designate primary technical contact (Ed / IT team)
- Weekly sync availability (30-60 min)
- Timely feedback on sprint demos (within 2-3 days)
- QA team available for validation phase
Sign-offs Required
- Phase 1 acceptance (before Phase 2 begins)
- Validation pass/fail criteria confirmation
- Migration sign-off (before legacy standby period)
- Final handover acceptance
Next Steps
How We Work
Sprint Cadence
2-week sprints with a demo at the end of each. Daily async status updates keep both sides aligned between demos.
Communication
Weekly updates via email or preferred channel. Ad-hoc communication is available anytime - we're reachable beyond email.
Progress Reports
Regular accomplishment reports on completed tasks, upcoming work, and blockers. Transparent sprint velocity tracking.
Escalation Protocol
Any risk impacting timeline by more than 1 week is escalated within 24 hours with a proposed mitigation plan. No surprises.
Detailed Project Timeline
23 weeks from kickoff to handover, including a 2-week buffer.
Sprint Targets & Success Criteria
2-week sprints. Each sprint ends with a live demo to NAP stakeholders. Progress is measured by what you can see and test, not by tasks completed internally.
- Live GCP environment with CI/CD pipeline running
- Database schema walkthrough (ERD on screen)
- User login flow working (admin + QA roles)
- Miffy campaign test data loaded into system
- Admin can log in and see empty dashboard
- CI/CD deploys to staging on git push
- Sample Miffy data visible in DB
- By Day 3: Artemis source code access - we need to understand the legacy schema to design ours
- By Day 5: Sample datasets (known counterfeit + known legitimate) - required for Sprint 4 AI training
- By Day 5: Miffy campaign matrix/guidelines - this is our pilot campaign
- By Week 1: GCP billing account linked - blocks all infrastructure provisioning
- Admin creates a Miffy campaign with keywords + reference images
- Role-based access: Admin sees everything, QA sees only review queue
- Background job queue processing visible in admin panel
- Admin can CRUD campaigns through the API
- JWT auth + RBAC enforced on all endpoints
- Job queue accepts crawl requests (even if crawler isn't built yet)
- Campaign configuration UI (keywords, images, filters, whitelist)
- QA review dashboard with mock AI-scored results
- Admin dashboard with placeholder analytics
- Seller location filtering UI
- NAP team can log in and navigate all screens
- Campaign creation works end-to-end (UI → API → DB)
- QA reviewer can accept/reject mock listings
- Live crawl of Miffy campaign on Amazon.com
- AI-scored results with confidence scores + reasoning
- Side-by-side: Argos results vs. what Artemis would have found
- First accuracy metrics (false positive/negative rates)
- Crawler successfully processes 100+ Amazon listings
- AI returns confidence scores for each listing
- QA reviewer can review real results in the dashboard
- No crawler blocks/bans from Amazon for 48+ hours
- Anti-bot detection - Amazon may block crawlers. ScraperAPI and proxy rotation mitigate this, but first contact is unpredictable
- AI accuracy - First real-world test of the scoring engine. May need threshold tuning
- Depends on Sprint 1 data - If sample datasets were delayed, AI training quality suffers here
- Complete flow: campaign → crawl → AI score → QA review → CSV export
- CSV format validated for Artemis compatibility
- Alert system: notifications when crawl completes, budget thresholds
- REST API endpoints documented (Swagger)
- QA team can do a full mock review session (10+ listings)
- CSV export opens correctly in NAP's existing tools
- API returns data matching dashboard view
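The tail of that flow - approved results out to CSV - can be sketched as below. The column set here is a placeholder; the real export mirrors the legacy Artemis CSV columns so it opens in NAP's existing tools unchanged:

```python
import csv
import io

# Hypothetical column set; the real export copies the legacy Artemis format.
FIELDS = ["listing_url", "platform", "campaign", "confidence_score", "qa_decision"]

def export_csv(rows: list[dict]) -> str:
    """Serialize approved review results to a CSV string."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow({k: row.get(k, "") for k in FIELDS})
    return buf.getvalue()
```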
- Full system demo with live Amazon data
- KPI dashboard: false positive/negative rates, crawl-to-review time, precision
- System running 5+ consecutive days with no P1/P2 bugs
- 2+ NAP team members independently operating the platform
- False positive/negative rates ≤ 5%
- Crawl-to-review ≤ 15 min (up to 5K listings)
- Precision ≥ 95%
- Miffy campaign produces actionable results
- NAP formal sign-off
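The accuracy criteria above reduce to standard confusion-matrix arithmetic over QA-confirmed outcomes. A minimal sketch, with the target values copied from the criteria and everything else illustrative:

```python
def kpis(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Precision and false positive/negative rates from QA-confirmed counts."""
    return {
        "precision": tp / (tp + fp),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }

def meets_phase1_targets(m: dict) -> bool:
    # Targets from the success criteria: precision >= 95%, FP/FN rates <= 5%
    return (m["precision"] >= 0.95
            and m["false_positive_rate"] <= 0.05
            and m["false_negative_rate"] <= 0.05)
```

The same metrics feed the KPI dashboard demoed in the final Phase 1 sprint.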
- Shopify store discovery engine running
- First Shopify-sourced listings with AI scores
- Shopify-specific heuristics (store age, reviews, pricing anomalies)
- Crawler discovers and processes Shopify stores
- Results appear in QA dashboard alongside Amazon results
- Platform source clearly labeled per listing
- Cross-platform matching: Amazon ↔ Shopify linked sellers
- Unified dashboard with platform filter
- API User management (token generation, access control)
- Multi-platform sellers flagged automatically
- Dashboard filters by platform, campaign, date, score
- Phase 2 acceptance criteria met
- Phase 2 acceptance demo (Shopify + cross-platform)
- Side-by-side comparison dashboard: Argos vs. legacy Artemis
- Shadow mode activated - Argos running alongside legacy
- Phase 2 signed off by NAP
- Shadow mode processing 1,000+ listings/day
- Automated comparison report generating daily
- 10-day blind evaluation results: Argos vs. legacy on all KPIs
- Supervised mode: 100% human review with legacy as fallback
- Architecture docs, API (Swagger), deployment runbook
- Training sessions: Ops (2×2hr), Technical (2×2hr), Platform (1×3hr)
- Argos matches/exceeds legacy on all KPIs for 5+ business days
- All documentation delivered and accepted
- NAP team completes all 3 training tracks
- NAP team operates independently (shadowed by Symph)
- Complete source code transferred (Git repo handover)
- Operations runbook & data dictionary
- NAP team running system independently for 2+ weeks
- 30-day post-handover support period begins
- All code, docs, credentials transferred
- NAP confirms independent operation capability
- Legacy standby period begins (30 days, rollback within 4hr)
- Final sign-off from NAP
Weekly Updates
Weekly status reports will appear here as the project progresses.
Investment & Payment Schedule
Milestone-based payments. Each requires written acceptance.
Ongoing Costs (Post-Launch)
AI Inference Costs (monthly)
| Volume | Estimated Cost |
|---|---|
| 10,000 listings | $120 - $180 |
| 50,000 listings | $450 - $650 |
| 100,000 listings | $800 - $1,100 |
| 500,000+ listings | $3,200 - $4,500 |
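For budgeting, the table above implies a per-listing cost that falls with volume - simple arithmetic on the published ranges, no new data:

```python
# (volume, (low estimate, high estimate)) straight from the table above
TIERS = {10_000: (120, 180), 50_000: (450, 650),
         100_000: (800, 1100), 500_000: (3200, 4500)}

# Dollars per listing at each tier
PER_LISTING = {vol: (lo / vol, hi / vol) for vol, (lo, hi) in TIERS.items()}
```

At 10K listings that is roughly $0.012 - $0.018 per listing, dropping to about $0.006 - $0.009 at 500K as pre-screening amortizes.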
Infrastructure (monthly, at cost)
| Service | Estimated Cost |
|---|---|
| Compute (Cloud Run/GKE) | $30 - $80 |
| Database (PostgreSQL) | $10 - $40 |
| Storage (GCS) | $5 - $20 |
| Hosting subtotal | $45 - $140 |
| ScraperAPI (proxy, separate) | ~$49 |
Documents
Project documents, links, and shared resources will be organized here.