Filtering markdown ruling files according to a prompt

new · October 19, 2025, 7:36pm

Hi friends, I have about 180k court judgments (rulings) stored as markdown files. I would like someone to write a Python script for me that searches every folder under the main source folder F:\cho\data\ocr\done\markdown-1970-2025 and writes the matching files to F:\cho\data\ocr\done\markdown-1970-2025\output. These are all legal court judgments, and my prompt is given below. I have tried some code, but when I run it my API starts returning errors after only a few files have been checked. I don't know why the API errors out so quickly:

F:\cho\data\ocr\grok>python filterofrullings.py
Checking A. Qutubuddin Khan (d_b_a “QMR Expert Consultants”) and others vs CHEC-Millwala Dredging Co. (Pvt.) Ltd_.md
Not favorable → skipped: F:\cho\data\ocr\done\markdown-1970-2025\2025\A. Qutubuddin Khan (d_b_a “QMR Expert Consultants”) and others vs CHEC-Millwala Dredging Co. (Pvt.) Ltd_.md
Checking A.M. Construction Company (Private) Limited vs Taisei Corporation, etc.md
Key gsk_pd0oVWDH hit 429. Removing permanently.
Not favorable → skipped: F:\cho\data\ocr\done\markdown-1970-2025\2025\A.M. Construction Company (Private) Limited vs Taisei Corporation, etc.md

=== Prompt for Filtering ===

BASE_PROMPT = """
You are an expert in Pakistani case law.
Evaluate the ruling below ONLY to decide if it clearly supports the case of Pir Salman Zahid
against Khair Muhammad in a property mutation dispute.

Mark “YES” only if the ruling supports one or more of these legal grounds in favor of Pir Salman Zahid:

Limitation Act / Time-Bar
- The appeal, revision, or administrative review was dismissed as time-barred.
- Delay was not condoned because the party had no sufficient cause.
- Court held that limitation is mandatory and jurisdiction cannot be exercised after limitation expires.

Mutation & Power of Attorney Fraud
- Mutation entries alone do not confer ownership when based on fraudulent, forged, or unpaid transactions.
- Fraud or misuse of Power of Attorney vitiates the transaction and the mutation can be cancelled.

Decision:

Reply ONLY with “YES” if the ruling strongly supports any of the above in Pir Salman Zahid’s favour.
Reply ONLY with “NO” if it does not.
"""

yawnxyz · October 20, 2025, 6:08pm

You’re getting 429 errors because your code is sending too many requests all at once, so our server will rate limit your requests.
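
One thing your log suggests ("hit 429. Removing permanently."): a 429 only means "slow down", so a key that hits it can usually be reused after a pause instead of being dropped. Here is a minimal synchronous sketch of that idea, waiting with exponential backoff and jitter before retrying. The `send` callable, `backoff_delays`, and `call_with_backoff` are hypothetical names for illustration, not part of any API; `send` stands in for whatever function in your script makes one request and returns a status code and body.

```python
import random
import time


def backoff_delays(attempts, base=1.0, cap=60.0, rng=random.random):
    """Yield one wait time in seconds per retry.

    The window doubles each attempt (capped at `cap`), and a random
    fraction of it is used so parallel workers do not retry in lockstep.
    """
    for attempt in range(attempts):
        window = min(cap, base * (2 ** attempt))
        yield rng() * window


def call_with_backoff(send, attempts=6, sleep=time.sleep):
    """Call send() up to `attempts` times, pausing after each 429.

    `send` is a hypothetical zero-argument callable returning
    (status, body). The last (status, body) pair is returned, which is
    still a 429 if every attempt was rate limited.
    """
    status, body = send()
    for delay in backoff_delays(attempts - 1):
        if status != 429:
            break
        sleep(delay)
        status, body = send()
    return status, body
```

With this shape the key stays in use after a 429; the script simply waits longer before the next attempt.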

I’m not really a python dev but I looked this up; you could use:
pip install aiohttp aiolimiter tenacity

and then you can do something like

import asyncio
import aiohttp
from aiolimiter import AsyncLimiter
from tenacity import retry, wait_random_exponential, stop_after_attempt, retry_if_exception_type

class TransientHttpError(Exception):
  pass

class SafeApiClient:
  def __init__(
    self,
    base_url: str,
    *,
    max_concurrency: int = 8,
    rps: int = 4,
    timeout_s: float = 30.0,
    default_headers: dict | None = None
  ):
    self.base_url = base_url.rstrip("/")
    self.sem = asyncio.Semaphore(max_concurrency)
    self.limiter = AsyncLimiter(max_rate=rps, time_period=1)
    self.timeout = aiohttp.ClientTimeout(total=timeout_s)
    self.default_headers = default_headers or {}
    self._session: aiohttp.ClientSession | None = None

  async def __aenter__(self):
    self._session = aiohttp.ClientSession(timeout=self.timeout)
    return self

  async def __aexit__(self, *exc):
    if self._session:
      await self._session.close()

  async def request(self, method: str, path: str, **kwargs):
    """
    method: "GET"/"POST"/...
    path: "/v1/things" or "v1/things"
    kwargs: passed to aiohttp (json=, params=, data=, headers=, etc.)
    """
    if not self._session:
      raise RuntimeError("Use 'async with SafeApiClient(...) as client:'")

    url = f"{self.base_url}/{path.lstrip('/')}"
    headers = {**self.default_headers, **kwargs.pop("headers", {})}

    @retry(
      retry=retry_if_exception_type(TransientHttpError),
      wait=wait_random_exponential(multiplier=0.2, max=10.0),
      stop=stop_after_attempt(6),
      reraise=True,
    )
    async def _do():
      async with self.limiter:
        async with self.sem:
          async with self._session.request(method, url, headers=headers, **kwargs) as resp:
            # Respect Retry-After on 429/503
            if resp.status in (429, 503):
              ra = resp.headers.get("Retry-After")
              if ra:
                try:
                  delay = float(ra)
                except ValueError:
                  delay = 0
                if delay > 0:
                  await asyncio.sleep(delay)
              raise TransientHttpError(f"{resp.status} {await _safe_snippet(resp)}")

            # Retry on typical transient 5xx
            if 500 <= resp.status < 600:
              raise TransientHttpError(f"{resp.status} {await _safe_snippet(resp)}")

            # Raise for other >= 400
            if 400 <= resp.status:
              text = await resp.text()
              raise aiohttp.ClientResponseError(
                request_info=resp.request_info,
                history=resp.history,
                status=resp.status,
                message=text,
                headers=resp.headers,
              )

            # Auto-decode JSON when possible
            ctype = resp.headers.get("Content-Type", "")
            if "application/json" in ctype:
              return await resp.json()
            return await resp.text()

    return await _do()

async def _safe_snippet(resp: aiohttp.ClientResponse, max_len: int = 200) -> str:
  try:
    text = await resp.text()
    return text[:max_len]
  except Exception:
    return "<no-body>"

# ---- example usage ----
# async def main():
#   async with SafeApiClient("https://api.example.com", default_headers={"Authorization": "Bearer TOKEN"}) as client:
#     # up to 8 in-flight, 4 requests/sec total
#     tasks = [
#       asyncio.create_task(client.request("GET", f"/v1/items/{i}"))
#       for i in range(20)
#     ]
#     results = await asyncio.gather(*tasks, return_exceptions=True)
#     print(results)
#
# asyncio.run(main())

Again, I’m not really a Python dev, but this (from ChatGPT) is the Python equivalent of what I usually do in JS: adding async semaphores and a debouncer. It runs up to 8 concurrent calls at 4 requests per second, with retries.
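
For the folder part of the question, here is a rough sketch of walking the source folder, asking a classifier about each ruling, and copying the "YES" files into the output folder. Everything named here (`iter_rulings`, `is_yes`, `filter_rulings`, and the `classify` callable) is hypothetical; `classify` stands in for whatever function sends the prompt plus the ruling text to the API and returns the model's reply. It assumes the files end in `.md` and are UTF-8.

```python
import os
import shutil
from pathlib import Path
from typing import Callable, Iterator


def iter_rulings(source: Path, output: Path) -> Iterator[Path]:
    """Yield every .md file under source, skipping the output folder inside it."""
    output = output.resolve()
    for dirpath, dirnames, filenames in os.walk(source):
        # Prune the output folder so already-copied files are never re-checked.
        dirnames[:] = [d for d in dirnames if (Path(dirpath) / d).resolve() != output]
        for name in sorted(filenames):
            if name.lower().endswith(".md"):
                yield Path(dirpath) / name


def is_yes(reply: str) -> bool:
    """True only for a bare YES, ignoring case, quotes, and a trailing period."""
    return reply.strip().strip("\"'“”.").upper() == "YES"


def filter_rulings(source: Path, output: Path, classify: Callable[[str], str]) -> list[Path]:
    """Copy each ruling that classify() answers YES for into output."""
    output.mkdir(parents=True, exist_ok=True)
    kept = []
    for path in iter_rulings(source, output):
        text = path.read_text(encoding="utf-8", errors="replace")
        if is_yes(classify(text)):
            shutil.copy2(path, output / path.name)
            kept.append(path)
    return kept
```

Two caveats with this sketch: files with the same name in different year folders would overwrite each other in the output folder, and it runs one file at a time, so with 180k files you would want to combine it with the rate-limited client above.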

Try it out!
