
Prerequisites

Before you begin, ensure you have:
  • Node.js 18.x or higher
  • npm or pnpm package manager
  • Siigo Nube account credentials with access to Corprecam (NIT 900142913) and/or Reciclemos (NIT 901328575)
  • Ngrok authentication token (get one at ngrok.com)
  • Access to the Corprecam administrative API endpoints

Installation

1. Clone the Repository

Clone the source code to your local machine:
git clone <repository-url>
cd playwright-corprecam
npm install

2. Install Playwright Browsers

The scraper uses Firefox. Install it with:
npx playwright install firefox

3. Configure Environment Variables

Create a .env file in the project root with the following configuration:
.env
# Server Configuration
PORT=3000

# Ngrok Configuration
NGROK_AUTHTOKEN=your_ngrok_token_here

# Siigo Credentials
USER_SIIGO_CORPRECAM=your_siigo_username
PASSWORD_SIIGO_CORPRECAM=your_siigo_password

# Database Configuration (for PHP API integration)
DB_HOST=localhost
DB_USER=your_database_user
DB_PASSWORD=your_database_password
DB_DATABASE=corprecam_db
DB_PORT=3306
The database credentials are used by the PHP backend API endpoints. If you’re only testing the scraper functionality, you can use placeholder values.
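A fail-fast loader for these variables can be sketched as follows. This is an illustrative pattern, not the project's actual config module; the `loadConfig` and `requireEnv` names are hypothetical, and it assumes the variables are already in `process.env` (e.g. loaded from `.env` by your runtime or a dotenv-style tool).

```typescript
// Hedged sketch: read the .env keys above from process.env, failing
// loudly at startup when a required credential is missing.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Built lazily so importing this module never throws on its own.
function loadConfig() {
  return {
    PORT: Number(process.env.PORT ?? "3000"),
    NGROK_AUTHTOKEN: requireEnv("NGROK_AUTHTOKEN"),
    USER_SIIGO_CORPRECAM: requireEnv("USER_SIIGO_CORPRECAM"),
    PASSWORD_SIIGO_CORPRECAM: requireEnv("PASSWORD_SIIGO_CORPRECAM"),
  };
}
```

Failing at startup with the variable's name is usually easier to debug than a login failure deep inside the scraper run.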

4. Start the Server

Launch the development server with auto-reload:
npm run dev
You should see output like:
Server running on port 3000
The server automatically creates an Ngrok tunnel and registers the public URL with the Corprecam backend.

Test the Scraper

Once the server is running, test it with a purchase order:

Using curl

curl -X POST http://localhost:3000/scrapping \
  -H "Content-Type: application/json" \
  -d '{"compra": "12345"}'

Using Postman

1. Create a POST Request

Set the request type to POST and URL to:
http://localhost:3000/scrapping

2. Set Headers

Add header:
Content-Type: application/json

3. Set Request Body

Choose “raw” format and enter:
{
  "compra": "12345"
}
Replace 12345 with a valid purchase order code from your Corprecam system.

4. Send Request

Click Send. A Firefox window will open automatically and you’ll see the scraper:
  • Log into Siigo Nube
  • Navigate to Documento Soporte creation
  • Fill in the supplier and consecutive number
  • Add each product line item
  • Select payment account
  • Close the browser
The API will respond with:
{
  "message": "ok"
}
The scraper runs in non-headless mode (headless: false), so you’ll see the browser window during execution. This is intentional for debugging and monitoring.

What Happens Behind the Scenes

When you POST to /scrapping, the system:
  1. Fetches Purchase Order Data from the Corprecam PHP backend:
    const compra = await getCompras(body.compra);
    const compraItems = await getCompraItems(body.compra);
    const materiales = await getMateriales(citem_material);
    const micro = await getMicro(Number(compra[0].com_micro_ruta));
    
  2. Transforms Data into a structured format:
    const ds = transfromDs(compra[0], compraItems, materiales, micro);
    // Result:
    // {
    //   proveedor_id: "1234567890",
    //   micro_id: "Route Name",
    //   corprecam: [{ codigo: "MAT001", cantidad: 10, precio: 5000 }],
    //   reciclemos: [{ codigo: "MAT002", cantidad: 5, precio: 3000 }]
    // }
    
  3. Executes Playwright Automation for each company with items:
    if (documentoSoporte.corprecam.length > 0) {
      await playwright_corprecam_reciclemos(
        documentoSoporte.corprecam,
        "25470", // Documento Soporte type code
        " BODEGA DE RIOHACHA ",
        " CAJA RIOHACHA ", // Payment account
        documentoSoporte.proveedor_id,
        config.USER_SIIGO_CORPRECAM,
        config.PASSWORD_SIIGO_CORPRECAM,
        "900142913" // Corprecam NIT
      );
    }
    
  4. Returns Success Response after all documents are created.
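
The split in step 2, where line items are routed to either the Corprecam or Reciclemos document, can be sketched like this. It is an illustrative implementation, not the project's actual transform code; the `empresa` field and the `splitByCompany` helper are assumptions about the item shape.

```typescript
// Hedged sketch of the per-company split performed by the transform step.
type LineItem = { codigo: string; cantidad: number; precio: number };
type SourceItem = LineItem & { empresa: "corprecam" | "reciclemos" };

function splitByCompany(items: SourceItem[]) {
  const corprecam: LineItem[] = [];
  const reciclemos: LineItem[] = [];
  for (const { empresa, ...item } of items) {
    // Route each purchase-order line to its owning company's document.
    (empresa === "corprecam" ? corprecam : reciclemos).push(item);
  }
  return { corprecam, reciclemos };
}
```

Because each array is checked for length before automation runs, a purchase order with items for only one company produces only one Documento Soporte.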

Understanding the Response Flow

Successful Execution

{
  "message": "ok"
}
This means all Documento Soporte documents were created successfully in Siigo.

Error Scenarios

If the scraper encounters errors (e.g., invalid credentials, network issues, or Siigo UI changes), the process will retry operations automatically using the retryUntilSuccess utility.
Check the server console logs for detailed execution information. Each major step logs progress messages.
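A retry helper in the spirit of retryUntilSuccess can be sketched as below. The project's actual utility may have a different signature; the attempt count and delay here are illustrative defaults.

```typescript
// Hedged sketch: retry an async operation until it succeeds or the
// attempt budget is exhausted, pausing between attempts.
async function retryUntilSuccess<T>(
  fn: () => Promise<T>,
  attempts = 3,
  delayMs = 1000
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final failure.
      if (i < attempts - 1) await new Promise((r) => setTimeout(r, delayMs));
    }
  }
  throw lastError;
}
```

Wrapping flaky UI interactions this way is what lets transient Siigo slowness resolve itself without failing the whole run.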

Configuration Details

Port Configuration

The server listens on port 3000 by default. Change it in .env:
PORT=8080

Document Type Code

The scraper uses document type code 25470 for Documento Soporte. This is defined in main.ts:6:
const documentoSoporteLabelCode = "25470";

Warehouse and Payment Accounts

Configured in main.ts:8-11:
const bodegaRiohacha = " BODEGA DE RIOHACHA ";
const cuentaContableCorprecam = " CAJA RIOHACHA ";
const cuentaContableReciclemos = " Efectivo ";
Note the leading and trailing spaces in these strings: they must match the exact option text in Siigo's dropdown menus, spaces included.
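
A quick illustration of why the spaces matter under exact string comparison (the constant mirrors the one in main.ts; the comparisons are just a demonstration):

```typescript
// The dropdown option text keeps its surrounding spaces, so an exact
// comparison must keep them too.
const cuentaContableCorprecam: string = " CAJA RIOHACHA ";

console.log(cuentaContableCorprecam === "CAJA RIOHACHA");   // false: trimmed text does not match
console.log(cuentaContableCorprecam === " CAJA RIOHACHA "); // true: exact text matches
```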

Company NITs

  • Corprecam: 900142913
  • Reciclemos: 901328575
These are hardcoded in main.ts:25 and main.ts:38.

Remote Access via Ngrok

The server automatically creates a public URL when it starts:
const listener = await ngrok.forward({
  addr: 3000,
  authtoken: config.NGROK_AUTHTOKEN,
});

await setNgrok(listener.url());
This allows the Corprecam administrative panel to trigger the scraper remotely. The public URL is registered with the backend via the setNgrok() function.
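One plausible shape for setNgrok() is an HTTP POST of the tunnel URL to the backend, sketched below. The endpoint path and payload shape are assumptions for illustration, not the project's real route.

```typescript
// Hedged sketch: register the Ngrok public URL with the backend so the
// admin panel knows where to reach the scraper. The URL below is a
// placeholder, not the real Corprecam endpoint.
async function setNgrok(publicUrl: string): Promise<void> {
  await fetch("https://corprecam.example.com/api/set-ngrok", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url: publicUrl }),
  });
}
```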

Next Steps

API Reference

Learn about request/response formats

Configuration

Customize warehouse, accounts, and document types

Troubleshooting

Debug common issues

Architecture

Understand the system design

Development Mode

The npm run dev command uses Node’s --watch flag for automatic reload on file changes:
"scripts": {
  "dev": "node --watch server.ts"
}
Edit any .ts file and the server restarts automatically.

Verifying Installation

After starting the server, verify:
  1. Server logs show Server running on port 3000
  2. No error messages about missing dependencies
  3. Ngrok connection established (check Ngrok dashboard)
  4. Test request to /scrapping opens Firefox window
If you encounter issues, see the Troubleshooting guide.
