Development¶
Prerequisites¶
- Java 17 (JDK)
- Maven 3.9+
- Node.js 18+
- PostgreSQL 15+ (or Docker)
Project Structure¶
.
├── shared/ # JPA entities, repositories, shared services
├── pipeline-service/ # Spring Boot — scan, extract, analyze, generate
│ ├── src/main/java/com/topicscanner/
│ │ ├── api/ # REST controllers
│ │ ├── scanner/ # SourceScanner SPI + implementations
│ │ ├── extraction/ # Content extraction
│ │ ├── analyzer/ # Classification and analysis
│ │ ├── generator/ # Content generation + style analysis
│ │ ├── queue/ # PostgreSQL job queue
│ │ ├── llm/ # LLM providers (Ollama, OpenAI, Claude)
│ │ ├── filter/ # 7-stage filter chain
│ │ └── config/ # Spring configuration
│ └── Dockerfile
├── webui-nodejs/ # Next.js 14 frontend
│ ├── app/ # Pages (App Router)
│ ├── components/ # UI components
│ ├── lib/ # API client, utilities
│ └── Dockerfile
├── helm/cloud-native-scanner-v2/ # Helm chart
├── .github/workflows/ci.yaml # CI pipeline
├── docs/ # This documentation
└── pom.xml # Maven parent POM
Building¶
Java (Maven)¶
# Build all modules
mvn clean install -DskipTests
# Build only shared + pipeline-service
mvn package -pl shared,pipeline-service -am -DskipTests
# Run tests
mvn verify -pl shared,pipeline-service -am
# Run pipeline-service locally
mvn spring-boot:run -pl pipeline-service
Frontend (Next.js)¶
cd webui-nodejs
npm install # Install dependencies
npm run dev # Dev server at http://localhost:3000
npm run build # Production build
npm run lint # ESLint
npx tsc --noEmit # Type check
npm test # Jest tests
Running Locally¶
1. Start PostgreSQL¶
Run PostgreSQL 15+ in Docker, or use a local PostgreSQL instance. On a local instance, create the database and user:
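-- Create the role first so it can own the database
CREATE USER topicscanner WITH PASSWORD 'topicscanner';
-- On PostgreSQL 15+, only the database owner can create tables in the public schema by default
CREATE DATABASE topicscanner OWNER topicscanner;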
Enable pgvector (optional, for embedding features) by running this in the topicscanner database:
CREATE EXTENSION IF NOT EXISTS vector;
2. Start pipeline-service¶
# From the repository root
mvn spring-boot:run -pl pipeline-service
Flyway creates all tables on first run. The API is available at http://localhost:8080.
3. Start the frontend¶
cd webui-nodejs
npm run dev
The UI is at http://localhost:3000. It proxies API calls to http://localhost:8080.
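To confirm that both processes are listening, a small smoke check can probe the two URLs above. The following is a minimal sketch using only the JDK's built-in HTTP client. The class name `LocalSmokeCheck` and its helper methods are hypothetical and not part of the project, and the sketch assumes nothing about which paths the services expose: any HTTP status, including 404, only shows that the port is answering.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper, not part of the repository.
public class LocalSmokeCheck {

    // Builds a GET request with a short timeout so a stopped service fails fast.
    static HttpRequest probe(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(3))
                .GET()
                .build();
    }

    // Returns a one-line status for the URL; an unreachable service is reported, not thrown.
    static String check(HttpClient client, String url) {
        try {
            HttpResponse<Void> response =
                    client.send(probe(url), HttpResponse.BodyHandlers.discarding());
            return url + " -> HTTP " + response.statusCode();
        } catch (IOException e) {
            return url + " -> unreachable (" + e.getClass().getSimpleName() + ")";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return url + " -> interrupted";
        }
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        // URLs taken from the steps above.
        System.out.println(check(client, "http://localhost:8080"));
        System.out.println(check(client, "http://localhost:3000"));
    }
}
```

With Java 17 this can be run directly as a single source file: `java LocalSmokeCheck.java`.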
4. (Optional) Start Ollama¶
Adding a New Scanner¶
See CONTRIBUTING.md for the step-by-step guide.
In brief:
- Create a class implementing SourceScanner with @Component
- Implement getSourceType(), getDisplayName(), and scan(ScanRequest)
- Optionally override extractContent(url) for custom extraction
- Write tests with MockWebServer
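The steps above might look roughly like the following sketch. It is not the project's actual SPI: the method names come from the list above, but the return types, the fields of `ScanRequest`, the `ScanResult` type, and the `ExampleFeedScanner` class are all assumptions made so the example compiles on its own. In the real module the implementation would implement the project's own `SourceScanner` and carry Spring's `@Component` annotation. CONTRIBUTING.md has the authoritative signatures.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

public class ScannerSketch {

    // Stand-in for the project's ScanRequest; the real fields are not shown in this document.
    record ScanRequest(String query, int maxResults) {}

    // Hypothetical result type, invented for this sketch.
    record ScanResult(String title, String url) {}

    // Stand-in for the SourceScanner SPI. Method names follow the list above;
    // parameter and return types are assumptions.
    interface SourceScanner {
        String getSourceType();
        String getDisplayName();
        List<ScanResult> scan(ScanRequest request);

        // Optional hook: the default signals "no custom extraction".
        default Optional<String> extractContent(String url) {
            return Optional.empty();
        }
    }

    // In the real module this class would be annotated with @Component
    // so that Spring discovers it during component scanning.
    static class ExampleFeedScanner implements SourceScanner {
        @Override public String getSourceType() { return "example-feed"; }
        @Override public String getDisplayName() { return "Example Feed"; }

        @Override
        public List<ScanResult> scan(ScanRequest request) {
            // A real scanner would call a remote source; fixed data keeps the sketch runnable.
            List<ScanResult> all = List.of(
                    new ScanResult("Kubernetes release notes", "https://example.com/1"),
                    new ScanResult("Helm chart tips", "https://example.com/2"),
                    new ScanResult("Kubernetes operators", "https://example.com/3"));
            String needle = request.query().toLowerCase(Locale.ROOT);
            return all.stream()
                    .filter(r -> r.title().toLowerCase(Locale.ROOT).contains(needle))
                    .limit(Math.max(0, request.maxResults()))
                    .toList();
        }
    }

    public static void main(String[] args) {
        SourceScanner scanner = new ExampleFeedScanner();
        for (ScanResult r : scanner.scan(new ScanRequest("kubernetes", 5))) {
            System.out.println(scanner.getSourceType() + ": " + r.title() + " (" + r.url() + ")");
        }
    }
}
```

Giving `extractContent` a default implementation is one way to make it optional, so that a scanner only overrides it when the generic extraction is not enough.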
CI Pipeline¶
The GitHub Actions CI (ci.yaml) runs:
- Java: mvn verify -pl shared,pipeline-service -am
- Next.js: lint, type check, build
- Docker: build and push both images to GHCR (main branch only)
- Helm: lint the v2 chart