01-ai/yi-34b-chat 🔢📝 → 📝

▶️ 320.3K runs 📅 Nov 2023 ⚙️ Cog 0.8.6 🔗 GitHub ⚖️ License
code-generation question-answering text-generation text-translation

About

The Yi series models are large language models trained from scratch by developers at 01.AI.

Example Output

Prompt:

"Write a script to download the images for the top 10 posts of all time from /r/pics using the PRAW library"

Output

To create a script that downloads the images for the top 10 posts of all time from r/pics, you'll need to use the PRAW (Python Reddit API Wrapper) library. Below is an example script that demonstrates how to do this:

import praw
from urllib import request
import os
import json

# Your Reddit account must be authorized as described here:
# https://praw.readthedocs.io/en/latest/getting_started/authentication.html
reddit = praw.Reddit(
    client_id='your-client-id',
    client_secret='your-client-secret',
    username='your-username',
    password='your-password',
    user_agent='pics-downloader by u/your-username',  # PRAW requires a user_agent
)

# Get the top submissions of all time on r/pics. Materialize the generator
# as a list so it can be iterated more than once below.
submissions = list(reddit.subreddit('pics').top(time_filter='all', limit=10))  # Adjust 'limit' according to your needs

IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.gif')

for submission in submissions:
    # Skip stickied posts and posts that don't link directly to an image file.
    if not submission.stickied and submission.url.lower().endswith(IMAGE_EXTENSIONS):
        print(f"Downloading image from {submission.url}")
        try:
            response = request.urlopen(submission.url)
            imgdata = response.read()
            with open(os.path.basename(submission.url), 'wb') as fh:
                fh.write(imgdata)
        except Exception as e:
            print(e)
            continue

# If you want to save the metadata along with the images, you can also
# collect some information from each post. For example:
metadata = []
for submission in submissions:
    try:
        post_dict = {}
        post_dict['title'] = submission.title
        post_dict['score'] = submission.score
        post_dict['permalink'] = submission.permalink
        post_dict['author'] = str(submission.author)  # Redditor objects are not JSON-serializable
        post_dict['url'] = submission.url
        metadata.append(post_dict)
    except Exception as e:
        print(e)
        continue

# Save the metadata to a file if desired
with open('pics_posts_metadata.json', 'w') as outfile:
    json.dump(metadata, outfile)

Please replace 'your-client-id', 'your-client-secret', 'your-username', and 'your-password' with your actual credentials obtained from Reddit when setting up your application.

This script does the following:

  1. Authenticates with Reddit using OAuth2 via PRAW.
  2. Fetches the top 10 submissions from r/pics sorted by all-time popularity.
  3. Skips stickied posts and checks whether each submission links to an image before downloading it.
  4. Downloads the image from the URL provided by the submission object.
  5. Saves the downloaded image files locally with their original names.
  6. Optionally scrapes additional metadata such as title, score, permalink, author, and url for each post.
  7. Saves the metadata to a JSON file named pics_posts_metadata.json.

Make sure to install the PRAW library before running the script:

pip install praw

Also note that downloading content without proper attribution may violate copyright laws or go against Reddit's terms of service. Always ensure that you have permission to download and use the content, and consider giving credit to the original authors where appropriate.

Performance Metrics

47.80s Prediction Time
50.00s Total Time
All Input Parameters
{
  "top_k": 50,
  "top_p": 0.8,
  "prompt": "Write a script to download the images for the top 10 posts of all time from /r/pics using the PRAW library",
  "temperature": 0.3,
  "max_new_tokens": 1024,
  "prompt_template": "<|im_start|>system\nYou are a helpful assistant<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n",
  "repetition_penalty": 1.2
}
Input Parameters
top_k Type: integer, Default: 50
The number of highest-probability tokens to consider when generating the output. If > 0, only the top k tokens with the highest probability are kept (top-k filtering).
top_p Type: number, Default: 0.8
A probability threshold for generating the output. If < 1.0, only the top tokens with cumulative probability >= top_p are kept (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).
prompt (required) Type: string
The input prompt for the model.
temperature Type: number, Default: 0.3
The value used to modulate the next-token probabilities.
max_new_tokens Type: integer, Default: 1024
The maximum number of tokens the model should generate as output.
prompt_template Type: string, Default: <|im_start|>system\nYou are a helpful assistant<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n
The template used to format the prompt. The input prompt is inserted into the template using the `{prompt}` placeholder.
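The `{prompt}` substitution is plain string formatting. A minimal sketch of how the default template above is applied (the `format_prompt` helper is illustrative, not part of any API):

```python
# Default template from this model card; the literal "\n" sequences shown
# in the parameter table are real newlines in the template string.
PROMPT_TEMPLATE = (
    "<|im_start|>system\n"
    "You are a helpful assistant<|im_end|>\n"
    "<|im_start|>user\n"
    "{prompt}<|im_end|>\n"
    "<|im_start|>assistant\n"
)

def format_prompt(prompt: str, template: str = PROMPT_TEMPLATE) -> str:
    # Insert the user prompt at the {prompt} placeholder.
    return template.format(prompt=prompt)
```

The trailing `<|im_start|>assistant\n` leaves the template "open", so the model's generation continues as the assistant turn.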
repetition_penalty Type: number, Default: 1.2
Penalty applied to tokens that have already appeared in the output; values > 1.0 discourage repetition.
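The sampling parameters above are typically applied in sequence at each generation step. The sketch below is a simplified, illustrative re-implementation of how repetition penalty, temperature, top-k, and top-p filtering are commonly combined (for example in Hugging Face Transformers); the model's actual server-side implementation may differ.

```python
import math

def filter_next_token_probs(logits, temperature=0.3, top_k=50, top_p=0.8,
                            repetition_penalty=1.2, generated=()):
    """Toy sketch of common LLM sampling filters; not the model's actual code."""
    logits = dict(logits)

    # Repetition penalty: dampen tokens that were already generated.
    for tok in generated:
        if tok in logits:
            score = logits[tok]
            logits[tok] = score / repetition_penalty if score > 0 else score * repetition_penalty

    # Temperature: values below 1.0 sharpen the distribution.
    scaled = {tok: score / temperature for tok, score in logits.items()}

    # Softmax over the scaled logits (shifted by the max for stability).
    peak = max(scaled.values())
    exps = {tok: math.exp(score - peak) for tok, score in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}

    # Top-k: keep only the k highest-probability tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # Top-p (nucleus): keep the smallest prefix whose cumulative
    # probability reaches top_p (Holtzman et al., 2019).
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize the surviving probabilities before sampling.
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}
```

With the defaults used above (temperature 0.3, top_k 50, top_p 0.8), the distribution is sharpened first, then truncated to at most 50 candidates, then trimmed to the smallest nucleus covering 80% of the probability mass.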
Output Schema

Output

Type: array, Items Type: string
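Because the output type is an array of strings, a client typically concatenates the streamed chunks to reconstruct the full response. A minimal sketch (the chunk values here are illustrative):

```python
# The model returns its output as an array of string chunks; joining them
# with the empty string reconstructs the full generated text.
chunks = ["To create ", "a script ", "that downloads ", "images..."]  # illustrative values
text = "".join(chunks)
print(text)
```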

Version Details
Version ID
914692bbe8a8e2b91a4e44203e70d170c9c5ccc1359b283c84b0ec8d47819a46
Version Created
November 23, 2023
Run on Replicate →